Abstract

We use ideas from distributed computing to study dynamic environments in which computational
nodes, or decision makers, follow adaptive heuristics [16], i.e., simple and unsophisticated
rules of behavior, e.g., repeatedly "best replying" to others' actions, and minimizing "regret",
that have been extensively studied in game theory and economics. We explore when convergence 
of such simple dynamics to an equilibrium is guaranteed in asynchronous computational envi- 
ronments, where nodes can act at any time. Our research agenda, distributed computing with 
adaptive heuristics, lies on the borderline of computer science (including distributed computing 
and learning) and game theory (including game dynamics and adaptive heuristics). We exhibit 
a general non-termination result for a broad class of heuristics with bounded recall — that is, 
simple rules of behavior that depend only on recent history of interaction between nodes. We 
consider implications of our result across a wide variety of interesting and timely applications: 
game theory, circuit design, social networks, routing and congestion control. We also study 
the computational and communication complexity of asynchronous dynamics and present some 
basic observations regarding the effects of asynchrony on no-regret dynamics. We believe that 
our work opens a new avenue for research in both distributed computing and game theory. 
1 Introduction



Dynamic environments where computational nodes, or decision makers, repeatedly interact arise in 
a variety of settings, such as Internet protocols, large-scale markets, social networks, multi-processor 
computer architectures, and more. In many such settings, the prescribed behavior of the nodes is 
often simple, natural and myopic (that is, a heuristic or "rule of thumb"), and is also adaptive, in 
the sense that nodes constantly and autonomously react to others. These "adaptive heuristics" — a 
term coined in [16] — include simple behaviors, e.g., repeatedly "best replying" to others' actions, 
and minimizing "regret", that have been extensively studied in game theory and economics.

Adaptive heuristics are simple and unsophisticated, often reflecting either the desire or necessity 
for computational nodes (whether humans or computers) to provide quick responses and have a 
limited computational burden. In many interesting contexts, these adaptive heuristics can, in the 
long run, move the global system in good directions and yield highly rational and sophisticated 
behavior, such as in game theory results demonstrating the convergence of best-response or no-regret
dynamics to equilibrium points (see [16] and references therein).

However, these positive results for adaptive heuristics in game theory are, with but a few ex- 
ceptions (see Section [2]), based on the sometimes implicit and often unrealistic premise that nodes' 
actions are somehow synchronously coordinated. In many settings, where nodes can act at any 
time, this kind of synchrony is not available. It has long been known that asynchrony introduces 
substantial difficulties in distributed systems, as compared to synchrony [12], due to the "limitation 
imposed by local knowledge" [23]. There has been much work in distributed computing on identify- 
ing conditions that guarantee protocol termination in asynchronous computational environments. 
Over the past three decades, we have seen many results regarding the possibility/impossibility borderline
for failure-resilient computation [11, 24]. In the classical results of that setting, the risk of
non-termination stems from the possibility of failures of nodes or other components. 

We seek to bring together these two areas to form a new research agenda on distributed com- 
puting with adaptive heuristics. Our aim is to draw ideas from distributed computing theory to 
investigate provable properties and possible worst-case system behavior of adaptive heuristics in 
asynchronous computational environments. We take the first steps of this research agenda. We 
show that a large and natural class of adaptive heuristics fail to provably converge to an equilib- 
rium in an asynchronous setting, even if the nodes and communication channels are guaranteed to 
be failure-free. This has implications across a wide domain of applications: convergence of game 
dynamics to pure Nash equilibria; stabilization of asynchronous circuits; convergence to a stable
routing tree of the Border Gateway Protocol, which handles Internet routing; and more. We also
explore the impact of scheduling on convergence guarantees. We show that non-convergence is 
not inherent to adaptive heuristics, as some forms of regret minimization provably converge in 
asynchronous settings. In more detail, we make the following contributions: 

General non-convergence result (Section [4]). It is often desirable or necessary due to practical
constraints that computational nodes' (e.g., routers') behavior rely on limited memory and
processing power. In such contexts, nodes' adaptive heuristics are often based on bounded recall
(i.e., depend solely on recent history of interaction with others) and can even be historyless (i.e.,
nodes only react to other nodes' current actions). We exhibit a general impossibility result using
a valency argument, a now-standard technique in distributed computing theory [11, 24], to show
that a broad class of bounded-recall adaptive heuristics cannot always converge to a stable state.
More specifically, we show that, for a large family of such heuristics, simply the existence of two 
"equilibrium points" implies that there is some execution that does not converge to any outcome
even if nodes and communication channels are guaranteed not to fail. We also give evidence that 
our non-convergence result is essentially tight. 

Implications across a wide variety of interesting and timely applications (Section [5]). 

We apply our non-convergence result to a wide variety of interesting environments, namely conver- 
gence of game dynamics to pure Nash equilibria, stabilization of asynchronous circuits, diffusion of 
technologies in social networks, routing on the Internet, and congestion control protocols. 
Implications for convergence of r-fairness and randomness (Section [6]). We study the
effects on convergence to a stable state of natural restrictions on the order of nodes' activations (i.e.,
the order in which nodes have the opportunity to take steps) that have been extensively studied
in distributed computing theory: (1) r-fairness, which is the guarantee that each node selects a
new action at least once within every r consecutive time steps, for some pre-specified r > 0; and
(2) randomized selection of the initial state of the system and the order of nodes' activations.
Communication and computational complexity of asynchronous dynamics (Section [7]). 
We study the tractability of determining whether convergence to a stable state is guaranteed. We 
present two complementary hardness results that establish that, even for extremely restricted kinds 
of interactions, this feat is hard: (1) an exponential communication complexity lower bound; and 
(2) a computational complexity PSPACE-completeness result that, alongside its computational 
implications, implies that we cannot hope to have short witnesses of guaranteed asynchronous 
convergence (unless PSPACE ⊆ NP).

Asynchronous no-regret dynamics (Section [8]). We present some basic observations about
the convergence properties of no-regret dynamics in our framework, which establish that, in contrast
to other adaptive heuristics, regret minimization is quite robust to asynchrony.
Further discussion of a research agenda in distributed computing with adaptive heuristics
(Section [9]). We believe that this work has but scratched the surface in the exploration of
the behavior of adaptive heuristics in asynchronous computational environments. Many important
questions remain wide open. We present context-specific problems in the relevant sections, and
also outline general interesting directions for future research in Section [9].

Before presenting our main results, we overview related work (Section [2]) and provide a detailed 
description of our model (Section [3]). 

2 Related Work 

Our work relates to many ideas in game theory and in distributed computing. We discuss game 
theoretic work on adaptive heuristics and on asynchrony, and also distributed computing work on 
fault tolerance and self stabilization. We also highlight the application areas we consider. 
Adaptive heuristics. Much work in game theory and economics deals with adaptive heuristics 
(see Hart [16] and references therein). Generally speaking, this long line of research explores the
"convergence" of simple and myopic rules of behavior (e.g., best-response/fictitious-play/no-regret
dynamics) to an "equilibrium". However, with few exceptions (see below), such analysis has so 
far primarily concentrated on synchronous environments in which steps take place simultaneously 
or in some other predetermined prescribed order. In contrast, we explore adaptive heuristics in 
asynchronous environments, which are more realistic for many applications. 

Game-theoretic work on asynchronous environments. Some game-theoretic work on repeated
games considers "asynchronous moves"¹ (see [23, 34], among others, and references therein).
Such work does not explore the behavior of dynamics, but has other research goals (e.g., characterizing
equilibria, establishing Folk theorems). We are, to the best of our knowledge, the first to
study the effects of asynchrony (in the broad distributed computing sense) on the convergence of 
game dynamics to equilibria. 

Fault-tolerant computation. We use ideas and techniques from work in distributed computing
on protocol termination in asynchronous computational environments where nodes and communication
channels are possibly faulty. Protocol termination in such environments, initially motivated
by multi-processor computer architectures, has been extensively studied in the past three
decades [2, 4, 7, 12, 20, 29], as nicely surveyed in [11, 24]. Fischer, Lynch and Paterson [12] showed, in
a landmark paper, that a broad class of failure-resilient consensus protocols cannot provably terminate.
Intuitively, the risk of protocol nontermination in [12] stems from the possibility of failures; a
computational node cannot tell whether another node is silent due to a failure or is simply taking
a long time to react. Our focus here is, in contrast, on failure-free environments.
Self stabilization. The concept of self stabilization is fundamental to distributed computing and
dates back to Dijkstra, 1973 (see [8] and references therein). Convergence of adaptive heuristics to
an "equilibrium" in our model can be viewed as the self stabilization of such dynamics (where the
"equilibrium points" are the legitimate configurations). Our formulation draws ideas from work in
distributed computing (e.g., Burns' distributed daemon model) and in networking research [14] on
self stabilization.

Applications. We discuss the implications of our non-convergence result across a wide variety of 
applications, that have previously been studied: convergence of game dynamics (see, e.g., p^lTO]): 
asynchronous circuits (see, e.g., [6]); diffusion of innovations, behaviors, etc., in social networks (see
Morris [26] and also [21]); interdomain routing [14, 30]; and congestion control [13].

3 The Model 

We now present our model for analyzing adaptive heuristics in asynchronous environments. 
Computational nodes interacting. There is an interaction system with $n$ computational nodes,
$1, \ldots, n$. Each computational node $i$ has an action space $A_i$. Let $A = \times_{j \in [n]} A_j$, where
$[n] = \{1, \ldots, n\}$. Let $A_{-i} = \times_{j \in [n] \setminus \{i\}} A_j$. Let $\Delta(A_i)$ be the set of all probability
distributions over the actions in $A_i$.

Schedules. There is an infinite sequence of discrete time steps $t = 1, 2, \ldots$. A schedule is a function
$\sigma$ that maps each $t \in \mathbb{N}_+ = \{1, 2, \ldots\}$ to a nonempty set of computational nodes: $\sigma(t) \subseteq [n]$.
Informally, $\sigma$ determines (when we consider the dynamics of the system) which nodes are activated
in each time step. We say that a schedule $\sigma$ is fair if each node $i$ is activated infinitely many times
in $\sigma$, i.e., $\forall i \in [n]$, there are infinitely many $t \in \mathbb{N}_+$ such that $i \in \sigma(t)$. For $r \in \mathbb{N}_+$, we say that a
schedule $\sigma$ is r-fair if each node is activated at least once in every sequence of $r$ consecutive time
steps, i.e., if, for every $i \in [n]$ and $t_0 \in \mathbb{N}_+$, there is at least one value $t \in \{t_0, t_0 + 1, \ldots, t_0 + r - 1\}$
for which $i \in \sigma(t)$.
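
The r-fairness condition can be checked mechanically when a schedule is periodic. The following Python sketch is purely illustrative: the function name `is_r_fair` and the representation of a schedule as a finite list of activation sets repeated forever are assumptions made here, not part of the model.

```python
from typing import List, Set


def is_r_fair(period: List[Set[int]], n: int, r: int) -> bool:
    """Check r-fairness of the infinite schedule obtained by repeating `period`.

    Nodes are numbered 1..n. The schedule is r-fair if every node is
    activated at least once in every window of r consecutive time steps.
    """
    if any(len(active) == 0 for active in period):
        raise ValueError("a schedule must activate a nonempty set at each step")
    p = len(period)
    # The schedule is periodic, so it suffices to test the p windows that
    # start inside one period (indices wrap around).
    for start in range(p):
        window: Set[int] = set()
        for offset in range(r):
            window |= period[(start + offset) % p]
        if not all(i in window for i in range(1, n + 1)):
            return False
    return True


# Round-robin over three nodes: each node acts once every three steps.
round_robin = [{1}, {2}, {3}]
print(is_r_fair(round_robin, n=3, r=3))  # True
print(is_r_fair(round_robin, n=3, r=2))  # False
```

Any periodic schedule that is r-fair for some r is also fair in the sense defined above, since every node then recurs infinitely often.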

History and reaction functions. Let $H_0 = \emptyset$, and let $H_t = A^t$ for every $t \geq 1$. Intuitively, an

¹Often, the term asynchrony merely indicates that players are not all activated at each time step, and thus is
used to describe environments where only one player is activated at a time ("alternating moves"), or where there is 
a probability distribution that determines who is activated when. 
element in $H_t$ represents a possible history of interaction at time step $t$. For each node $i$, there
is an infinite sequence of functions $f_i = (f_{(i,1)}, f_{(i,2)}, \ldots, f_{(i,t)}, \ldots)$ such that, for each $t \in \mathbb{N}_+$,
$f_{(i,t)} : H_t \to \Delta(A_i)$; we call $f_i$ the reaction function of node $i$. As discussed below, $f_i$ captures $i$'s
way of responding to the history of interaction in each time step.

Restrictions on reaction functions. We now present five possible restrictions on reaction
functions: determinism, self-independence, bounded recall, stationarity and historylessness.

1. Determinism: a reaction function fi is deterministic if, for each input, fi outputs a single 
action (that is, a probability distribution where a single action in Ai has probability 1). 

2. Self-independence: a reaction function $f_i$ is self-independent if node $i$'s own (past and
present) actions do not affect the outcome of $f_i$. That is, a reaction function $f_i$ is self-independent
if for every $t \geq 1$ there exists a function $g_t : A_{-i}^t \to \Delta(A_i)$ such that $f_{(i,t)} \equiv g_t$.

3. k-recall and stationarity: a node $i$ has k-recall if its reaction function $f_i$ only depends on
the $k$ most recent time steps, i.e., for every $t \geq k$, there exists a function $g : H_k \to \Delta(A_i)$
such that $f_{(i,t)}(x) = g(x|_k)$ for each input $x \in H_t$ ($x|_k$ here denotes the last $k$ coordinates, i.e.,
n-tuples of actions, of $x$). We say that a k-recall reaction function is stationary if the time
counter $t$ is of no importance. That is, a k-recall reaction function is stationary if there exists
a function $g : H_k \to \Delta(A_i)$ such that for all $t \geq k$, $f_{(i,t)}(x) = g(x|_k)$ for each input $x \in H_t$.

4. Historylessness: a reaction function $f_i$ is historyless if $f_i$ is 1-recall and stationary, that is,
if $f_i$ only depends on $i$'s and on $i$'s neighbors' most recent actions.

Dynamics. We now define dynamics in our model. Intuitively, there is some initial state (history
of interaction) from which the interaction system evolves, and, in each time step, some subset of the
nodes reacts to the past history of interaction. This is captured as follows. Let $s^{(0)}$, that shall be
called the "initial state", be an element in $H_w$, for some positive $w \in \mathbb{N}$. Let $\sigma$ be a schedule. We
now describe the "$(s^{(0)}, \sigma)$-dynamics". The system's evolution starts at time $t = w + 1$, when each
node $i \in \sigma(w + 1)$ simultaneously chooses an action according to $f_{(i,w+1)}$, i.e., node $i$ randomizes
over the actions in $A_i$ according to $f_{(i,w+1)}(s^{(0)})$. We now let $s^{(1)}$ be the element in $H_{w+1}$ for which
the first $w$ coordinates (n-tuples of nodes' actions) are as in $s^{(0)}$ and the last coordinate is the
n-tuple of realized nodes' actions at the end of time step $t = w + 1$. Similarly, in each time step
$t > w + 1$, each node $i \in \sigma(t)$ updates its action according to $f_{(i,t)}$, based on the past history
$s^{(t-w-1)}$; nodes' realized actions at time $t$, combined with $s^{(t-w-1)}$, define the history of interaction at
the end of time step $t$, $s^{(t-w)}$.
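
The evolution just described can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions that are not in the text: reaction functions are deterministic, a schedule is given as a finite list of activation sets (its first entry playing the role of the activation set at time w + 1), and the names `run_dynamics` and `Reaction` are hypothetical.

```python
from typing import Callable, List, Sequence, Set, Tuple

Profile = Tuple[str, ...]          # one action per node
History = List[Profile]            # sequence of action profiles
# Hypothetical signature: reaction(t, history) -> action chosen at time t.
Reaction = Callable[[int, History], str]


def run_dynamics(initial: History,
                 schedule: Sequence[Set[int]],
                 reactions: Sequence[Reaction]) -> History:
    """Extend `initial` (an element of H_w) by one profile per schedule entry.

    Nodes are numbered 1..n; reactions[i - 1] belongs to node i.
    """
    history = list(initial)
    w = len(initial)
    for step, active in enumerate(schedule):
        t = w + 1 + step
        current = history[-1]
        # All activated nodes react simultaneously to the same past history;
        # non-activated nodes keep their current action.
        nxt = tuple(
            reactions[i](t, history) if (i + 1) in active else current[i]
            for i in range(len(current))
        )
        history.append(nxt)
    return history


# Two nodes, each copying the other's most recent action (historyless).
def copy_node_2(t: int, h: History) -> str:
    return h[-1][1]


def copy_node_1(t: int, h: History) -> str:
    return h[-1][0]


trace = run_dynamics([("x", "y")], [{1, 2}, {1, 2}, {1}],
                     [copy_node_2, copy_node_1])
print(trace)
```

In this run the two nodes swap actions while both are activated together, and settle on a common action only once node 1 is activated alone.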

Convergence and convergent systems. We say that nodes' actions converge under the $(s^{(0)}, \sigma)$-dynamics
if there exist some positive $t_0 \in \mathbb{N}$ and some action profile $a = (a_1, \ldots, a_n)$, such that, for
all $t > t_0$, $s^{(t)} = a$. The dynamics is then said to converge to $a$, and $a$ is called a "stable state" (for
the $(s^{(0)}, \sigma)$-dynamics), i.e., intuitively, a stable state is a global action state that, once reached,
remains unchanged. We say that the interaction system is convergent if, for all initial states $s^{(0)}$
and fair schedules $\sigma$, the $(s^{(0)}, \sigma)$-dynamics converges. We say that the system is r-convergent if,
for all initial states $s^{(0)}$ and r-fair schedules $\sigma$, the $(s^{(0)}, \sigma)$-dynamics converges.
Update messages. Observe that, in our model, nodes' actions are immediately observable to
other nodes at the end of each time step ("perfect monitoring"). While this is clearly unrealistic in
some important real-life contexts (e.g., some of the environments considered below), this restriction
only strengthens our main results, which are impossibility results.

Deterministic historyless dynamics. Of special interest to us is the case that all reaction 
functions are deterministic and historyless. We observe that, in this case, stable states have a
simple characterization. Each reaction function $f_i$ is deterministic and historyless and so can be
specified by a function $g_i : A \to A_i$. Let $g = (g_1, \ldots, g_n)$. Observe that the set of all stable states
(for all possible dynamics) is precisely the set of all fixed points of $g$. Below, when describing nodes'
reaction functions that are deterministic and historyless we sometimes abuse notation and identify
each $f_i$ with $g_i$ (treating $f_i$ as a function from $A$ to $A_i$). In addition, when all the reaction functions
are also self-independent we occasionally treat each $f_i$ as a function from $A_{-i}$ to $A_i$.
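
For small systems, the fixed-point characterization of stable states can be checked by brute force. The sketch below is illustrative only; the helper name `stable_states` and the encoding of each reaction function as a Python callable on action profiles are assumptions made for the example.

```python
from itertools import product
from typing import Callable, List, Sequence, Tuple

Profile = Tuple[str, ...]


def stable_states(action_spaces: Sequence[Sequence[str]],
                  g: Sequence[Callable[[Profile], str]]) -> List[Profile]:
    """Return all fixed points of g = (g_1, ..., g_n), i.e., profiles a
    with g_i(a) == a_i for every node i."""
    return [
        a for a in product(*action_spaces)
        if all(g_i(a) == a[i] for i, g_i in enumerate(g))
    ]


# Two nodes that each copy the other's current action.
copy_other = [lambda a: a[1], lambda a: a[0]]
print(stable_states([("x", "y"), ("x", "y")], copy_other))
# [('x', 'x'), ('y', 'y')]
```

The enumeration visits every profile in A, so it is only practical when the product of the action spaces is small.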

4 Non-Convergence Result

We now present a general impossibility result for convergence of nodes' actions under bounded-recall 
dynamics in asynchronous, distributed computational environments. 

Theorem 4.1. If each reaction function has bounded recall and is self-independent then the existence
of multiple stable states implies that the system is not convergent.

We note that this result holds even if nodes' reaction functions are not stationary and are
randomized (randomized initial states and activations are discussed in Section [6]). We present the
proof of Theorem 4.1 in Appendix [F]. We now discuss some aspects of our impossibility result.
Neither bounded recall nor self-independence alone implies non-convergence. We show
that the statement of Theorem 4.1 does not hold if either the bounded-recall restriction, or the
self-independence restriction, is removed.

Example 4.2. (the bounded-recall restriction cannot be removed) There are two nodes, 1
and 2, each with the action space {x, y}. The deterministic and self-independent reaction functions
of the nodes are as follows: node 2 always chooses node 1's action; node 1 will choose y if node 2's
action changed from x to y in the past, and x otherwise. Observe that node 1's reaction function is
not bounded-recall but can depend on the entire history of interaction. We make the observations
that the system is convergent and has two stable states. Observe that if node 1 chooses y at some point
in time due to the fact that node 2's action changed from x to y, then it shall continue to do so
thereafter; if, on the other hand, 1 never does so, then, from some point in time onwards, node 1's
action is constantly x. In both cases, node 2 shall have the same action as node 1 eventually, and
thus convergence to one of the two stable states, (x, x) and (y, y), is guaranteed. Hence, two stable
states exist and the system is convergent nonetheless.

Example 4.3. (the self-independence restriction cannot be removed) There are two nodes,
1 and 2, each with action set {x, y}. Each node $i$'s deterministic and historyless reaction function
$f_i$ is as follows: $f_i(x, x) = y$; in all other cases the node always (re)selects its current action (e.g.,
$f_1(x, y) = x$, $f_2(x, y) = y$). Observe that the system has three stable states, namely all action
profiles but (x, x), yet can easily be seen to be convergent.
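
Example 4.3 is small enough to verify exhaustively. The Python sketch below is an illustration, not the paper's proof technique. It relies on a finite-state criterion assumed here: a deterministic historyless system fails to converge under some fair schedule exactly when some state lies on a closed walk that activates every node and changes the state at least once (repeating such a walk forever gives a fair, non-converging schedule). All function names are hypothetical.

```python
from itertools import combinations, product


def step(state, active, g):
    """Activated nodes (0-based indices in `active`) react simultaneously."""
    return tuple(g[i](state) if i in active else state[i]
                 for i in range(len(state)))


def is_convergent(action_spaces, g):
    """Brute-force convergence test for deterministic historyless systems."""
    n = len(action_spaces)
    full = frozenset(range(n))
    subsets = [frozenset(c) for size in range(1, n + 1)
               for c in combinations(range(n), size)]
    for start in product(*action_spaces):
        # Search nodes: (state, nodes activated so far, has the state changed?)
        first = (start, frozenset(), False)
        frontier, seen = [first], {first}
        while frontier:
            state, used, changed = frontier.pop()
            for active in subsets:
                nxt = step(state, active, g)
                node = (nxt, used | active, changed or nxt != state)
                if node == (start, full, True):
                    return False  # fair, non-constant cycle through `start`
                if node not in seen:
                    seen.add(node)
                    frontier.append(node)
    return True


spaces = [("x", "y"), ("x", "y")]

# Example 4.3: leave (x, x) by switching to y; otherwise keep the current action.
g43 = [lambda a: "y" if a == ("x", "x") else a[0],
       lambda a: "y" if a == ("x", "x") else a[1]]
fixed43 = [a for a in product(*spaces) if step(a, frozenset({0, 1}), g43) == a]
print(fixed43, is_convergent(spaces, g43))

# Contrast: self-independent "copy the other node" dynamics, two stable states.
copy_other = [lambda a: a[1], lambda a: a[0]]
print(is_convergent(spaces, copy_other))
```

The search is exponential in the number of nodes and is meant only for toy instances such as this one; the second system is not convergent, in line with Theorem 4.1.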

Connections to consensus protocols. We now briefly discuss the interesting connections between
Theorem 4.1 and the non-termination result for failure-resilient consensus protocols in [12].
We elaborate on this topic in Appendix [A]. Fischer et al. [12] explore when a group of processors
can reach a consensus even in the presence of failures, and exhibit a breakthrough non-termination
result. Our proof of Theorem 4.1 uses a valency argument, an idea introduced in the proof of the
non-termination result in [12].
Intuitively, the risk of protocol non-termination in [12] stems from the possibility of failures; a
computational node cannot tell whether another node is silent due to a failure or is simply taking
a long time to react. We consider environments in which nodes/communication channels cannot
fail, and so each node is guaranteed that all other nodes react after "sufficiently long" time. This
guarantee makes reaching a consensus in the environment of [12] easily achievable (see Appendix [A]).
Unlike the results in [12], the possibility of nonconvergence in our framework stems from limitations
on nodes' behaviors. Hence, there is no immediate translation from the result in [12] to ours (and
vice versa). To illustrate this point, we observe that in both Example 4.2 and Example 4.3 there
exist two stable states and an initial state from which both stable states are reachable (a "bivalent
state" [12]), yet the system is convergent (see Appendix [A]). This should be contrasted with the
result in [12] that establishes that the existence of an initial state from which two distinct outcomes
are reachable implies the existence of a non-terminating execution.

We investigate the link between consensus protocols and our framework further in Appendix [F],
where we take an axiomatic approach. We introduce a condition, "Independence of Decisions"
(IoD), that holds for both fault-resilient consensus protocols and for bounded-recall self-independent
dynamics. We then factor the arguments in [12] through IoD to establish a non-termination result
that holds for both contexts, thus unifying the treatment of these dynamic computational
environments.

5 Games, Circuits, Networks, and Beyond 

We present implications of our impossibility result in Section [4] for several well-studied environments:
game theory, circuit design, social networks and Internet protocols. We now briefly summarize these
implications, which, we believe, are themselves of independent interest. See Appendix [B] for a detailed
exposition of the results in this section.

Game theory. Our result, when cast into game-theoretic terminology, shows that if players' 
choices of strategies are not synchronized, then the existence of two (or more) pure Nash equilibria 
implies that a broad class of game dynamics (e.g., best-response dynamics with consistent tie-breaking)
are not guaranteed to reach a pure Nash equilibrium. This result should be contrasted
with positive results for such dynamics in the traditional synchronous game-theoretic environments. 

Theorem 5.1. If there are two (or more) pure Nash equilibria in a game, then all bounded-recall 
self-independent dynamics can oscillate indefinitely for asynchronous player activations. 

Corollary 5.2. If there are two (or more) pure Nash equilibria in a game, then best-response
dynamics, and bounded-recall best-response dynamics (studied in [35]), with consistent tie-breaking,
can fail to converge to an equilibrium in asynchronous environments.
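
A two-player coordination game gives a concrete instance of this corollary. The Python sketch below is illustrative only: the payoff numbers are hypothetical, and the schedule (both players activated at every step) is just one fair schedule that happens to exhibit the oscillation.

```python
ACTIONS = ("A", "B")

# Hypothetical payoffs of a symmetric 2x2 coordination game:
# payoff[i][(a1, a2)] is player i's payoff under the profile (a1, a2).
payoff = [
    {("A", "A"): 2, ("A", "B"): 0, ("B", "A"): 0, ("B", "B"): 1},
    {("A", "A"): 2, ("A", "B"): 0, ("B", "A"): 0, ("B", "B"): 1},
]


def best_response(i, profile):
    """Player i's best reply to the other player's current action.

    max() returns the first maximizer in ACTIONS order, which gives a
    consistent tie-breaking rule (no ties arise in this particular game).
    """
    def value(action):
        trial = list(profile)
        trial[i] = action
        return payoff[i][tuple(trial)]
    return max(ACTIONS, key=value)


def is_pure_nash(profile):
    # Valid here because every best response in this game is unique.
    return all(best_response(i, profile) == profile[i] for i in range(2))


equilibria = [(a, b) for a in ACTIONS for b in ACTIONS if is_pure_nash((a, b))]
print(equilibria)  # [('A', 'A'), ('B', 'B')]

# Both players are activated at every step, starting from a miscoordinated profile.
profile = ("A", "B")
trace = [profile]
for _ in range(4):
    profile = tuple(best_response(i, profile) for i in range(2))
    trace.append(profile)
print(trace)
```

The trace alternates between (A, B) and (B, A) forever, even though both (A, A) and (B, B) are pure Nash equilibria; activating a single player at any step would break the cycle.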

Circuits. Work on asynchronous circuits in computer architectures research explores the implica- 
tions of asynchrony for circuit design [6]. We observe that a logic gate can be regarded as executing 
an inherently historyless reaction function that is independent of the gate's past and present "state".
Thus, we show that our result has implications for the stabilization of asynchronous circuits. 

Theorem 5.3. If two (or more) stable Boolean assignments exist for an asynchronous Boolean
circuit, then that asynchronous circuit is not inherently stable.
Social networks. Understanding the ways in which innovations, ideas, technologies, and practices
disseminate through social networks is fundamental to the social sciences. We consider the classic
economic setting [26] (that has lately also been approached by computer scientists [21]) where each
decision maker has two technologies {A, B} to choose from, and each node in the social network
wishes to have the same technology as the majority of his "friends" (neighboring nodes in the social
network). We exhibit a general impossibility result for this environment.

Theorem 5.4. In every social network, the diffusion of technologies can potentially never converge 
to a stable global state. 

Networking. We consider two basic networking environments: (1) routing with the Border Gateway
Protocol (BGP), which is the "glue" that holds together the smaller networks that make up the
Internet; and (2) the fundamental task of congestion control in communication networks, which is
achieved through a combination of mechanisms on end-hosts (e.g., TCP), and on switches/routers
(e.g., RED and WFQ). We exhibit non-termination results for both these environments.

We abstract a recent result in [30] and prove that this result extends to several BGP-based 
multipath routing protocols that have been proposed in the past few years. 

Theorem 5.5. [30] If there are multiple stable routing trees in a network, then BGP is not safe
on that network.

We consider the model for analyzing dynamics of congestion presented in [13]. We present the 
following result. 

Theorem 5.6. If there are multiple capacity-allocation equilibria in the network then dynamics of
congestion can oscillate indefinitely.

6 r-Convergence and Randomness

We now consider the implications for convergence of two natural restrictions on schedules: r-fairness
and randomization. See Appendix [C] for a detailed exposition of the results in this section.
Snakes in boxes and r-convergence. Theorem 4.1 deals with convergence and not r-convergence,
and thus does not impose restrictions on the number of consecutive time steps in which a node can
be nonactive. What happens if there is an upper bound on this number, $r$? We now show that if
$r < n - 1$ then sometimes convergence of historyless and self-independent dynamics is achievable
even in the presence of multiple stable states (and so our impossibility result does not extend to
this setting).

Example 6.1. (a system that is convergent for $r < n - 1$ but nonconvergent for $r = n - 1$)

There are $n > 2$ nodes, $1, \ldots, n$, each with the action space {x, y}. Nodes' deterministic, historyless
and self-independent reaction functions are as follows. $\forall i \in [n]$, $f_i(x^{n-1}) = x$ and $f_i$ always outputs
y otherwise. Observe that there exist two stable states: $x^n$ and $y^n$. Observe that if $r = n - 1$ then
the following oscillation is possible. Initially, only node 1's action is y and all other nodes' actions
are x. Then, nodes 1 and 2 are activated and, consequently, node 1's action becomes x and node 2's
action becomes y. Next, nodes 2 and 3 are activated, and thus 2's action becomes x and 3's action
becomes y. Then 3, 4 are activated, then 4, 5, and so on (traversing all nodes over and over again
in cyclic order). This goes on indefinitely, never reaching one of the two stable states. Observe
that, indeed, each node is activated at least once within every sequence of $n - 1$ consecutive time
steps. We observe, however, that if $r < n - 1$ then convergence is guaranteed. To see this, observe
that if at some point in time there are at least two nodes whose action is y, then convergence to
$y^n$ is guaranteed. Clearly, if all nodes' action is x then convergence to $x^n$ is guaranteed. Thus, an
oscillation is possible only if, in each time step, exactly one node's action is y. Observe that, given
our definition of nodes' reaction functions, this can only be if the activation sequence is (essentially)
as described above, i.e., exactly two nodes are activated at a time. Observe also that this kind of
activation sequence is impossible for $r < n - 1$.
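
The oscillation in Example 6.1 can be replayed directly. The following Python sketch is an illustration with assumed names (`react`, `step`, `cyclic_schedule`, `max_idle_gap`) and 0-based node indices; it simulates one full cycle of the schedule and measures the longest stretch of consecutive steps during which a node is not activated.

```python
def react(i, state):
    """Node i plays x only if all other nodes currently play x, else y."""
    others = state[:i] + state[i + 1:]
    return "x" if all(a == "x" for a in others) else "y"


def step(state, active):
    """Activated nodes react simultaneously; the rest keep their action."""
    return tuple(react(i, state) if i in active else state[i]
                 for i in range(len(state)))


def cyclic_schedule(n):
    """One period of the schedule: step t activates nodes t and t + 1 (mod n)."""
    return [{t % n, (t + 1) % n} for t in range(n)]


def max_idle_gap(period, n):
    """Longest run of consecutive steps (cyclically) in which some node is idle."""
    p = len(period)
    worst = 0
    for i in range(n):
        for start in range(p):
            gap = 0
            while gap < p and i not in period[(start + gap) % p]:
                gap += 1
            worst = max(worst, gap)
    return worst


n = 5
period = cyclic_schedule(n)
start = ("y",) + ("x",) * (n - 1)

state, trace = start, [start]
for active in period:
    state = step(state, active)
    trace.append(state)

for s in trace:
    print("".join(s))
# After n steps the system is back at its initial state, so the cycle repeats.
print(state == start, max_idle_gap(period, n))
```

Every profile in the trace contains exactly one y, and the longest idle stretch is n - 2 steps, so each node acts at least once in every window of n - 1 consecutive steps but not in every window of n - 2.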

What about $r > n$? We use classical results in combinatorics regarding the size of a "snake-in-the-box"
in a hypercube [1] to construct systems that are r-convergent for exponentially large r's, but
are not convergent in general.

Theorem 6.2. Let $n \in \mathbb{N}$ be sufficiently large. There exists a system $G$ with $n$ nodes, in which
each node $i$ has two possible actions and each $f_i$ is deterministic, historyless and self-independent,
such that $G$ is r-convergent for $r \in \Omega(2^n)$, but $G$ is not (r + 1)-convergent.

We note that the construction in the proof of Theorem 6.2 is such that there is a unique stable
state. We believe that the same ideas can be used to prove the same result for systems with 
multiple stable states but the exact way of doing this eludes us at the moment, and is left as an 
open question. 

Problem 6.3. Prove that for every sufficiently large $n \in \mathbb{N}$, there exists a system $G$ with $n$ nodes, in
which each node $i$ has two possible actions, each $f_i$ is deterministic, historyless and self-independent,
and $G$ has multiple stable states, such that $G$ is r-convergent for $r \in \Omega(2^n)$ but $G$ is not (r + 1)-convergent.

Does random choice (of initial state and schedule) help? Theorem 4.1 tells us that, for a
broad class of dynamics, a system with multiple stable states is nonconvergent if the initial state 
and the node-activation schedule are chosen adversarially. Can we guarantee convergence if the 
initial state and schedule are chosen at random? 

Example 6.4. (random choice of initial state and schedule might not help) There are $n$
nodes, $1, \ldots, n$, and each node has action space {x, y, z}. The (deterministic, historyless and self-independent)
reaction function of each node $i \in \{3, \ldots, n\}$ is such that $f_i(x^{n-1}) = x$; $f_i(z^{n-1}) = z$;
and $f_i = y$ for all other inputs. The (deterministic, historyless and self-independent) reaction
function of each node $i \in \{1, 2\}$ is such that $f_i(x^{n-1}) = x$; $f_i(z^{n-1}) = z$; $f_i(xy^{n-2}) = y$; $f_i(y^{n-1}) = x$;
and $f_i = y$ for all other inputs. Observe that there are exactly two stable states: $x^n$ and $z^n$.
Observe also that if nodes' actions in the initial state do not contain at least $n - 1$ x's, or at least
$n - 1$ z's, then, from that moment forth, each activated node in the set $\{3, \ldots, n\}$ will choose the
action y. Thus, eventually the actions of all nodes in $\{3, \ldots, n\}$ shall be y, and so neither of the two
stable states will be reached. Hence, there are $3^n$ possible initial states, such that only from $4n + 2$
can a stable state be reached. When choosing the initial state uniformly at random the probability
of landing on a "good" initial state (in terms of convergence) is thus exponentially small.
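
The count of "good" initial states can be checked by enumeration. The short Python sketch below only counts the profiles that satisfy the condition stated in the example (at least n - 1 x's, or at least n - 1 z's); it does not simulate the reaction functions, and the function name `good_initial_states` is an assumption made for illustration.

```python
from itertools import product


def good_initial_states(n):
    """Number of profiles over {x, y, z}^n with at least n - 1 x's
    or at least n - 1 z's."""
    return sum(
        1 for s in product("xyz", repeat=n)
        if s.count("x") >= n - 1 or s.count("z") >= n - 1
    )


for n in range(3, 9):
    good = good_initial_states(n)
    # Fraction of initial states from which a stable state can still be reached.
    print(n, good, 4 * n + 2, good / 3 ** n)
```

For each n in this range the enumeration agrees with the closed form 4n + 2, while the total number of initial states grows as 3^n, which is why the fraction shrinks exponentially.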

7 Complexity of Asynchronous Dynamics 

We now explore the communication complexity and computational complexity of determining 
whether a system is convergent. We present hardness results in both models of computation even 
for the case of deterministic and historyless adaptive heuristics. See Appendix [D] for a detailed
exposition of the results in this section. 

We first present the following communication complexity result whose proof relies on
combinatorial "snake-in-the-box" constructions [1].

Theorem 7.1. Determining if a system with n nodes, each with 2 actions, is convergent requires 
$\Omega(2^n)$ bits. This holds even if all nodes have deterministic, historyless and self-independent reaction
functions. 

The above communication complexity hardness result required the representation of the reaction 
functions to (potentially) be exponentially long. What if the reaction functions can be succinctly 
described? We now present a strong computational complexity hardness result for the case that 
each reaction function $f_i$ is deterministic and historyless, and is given explicitly in the form of a
Boolean circuit (for each $a \in A$ the circuit outputs $f_i(a)$). We prove the following result.

Theorem 7.2. Determining if a system with n nodes, each with a deterministic and historyless 
reaction function, is convergent is PSPACE-complete. 

Our computational complexity result shows that even if nodes' reaction functions can be
succinctly represented, determining whether the system is convergent is PSPACE-complete. This
result, alongside its computational implications, implies that we cannot hope to have short
"witnesses" of guaranteed asynchronous convergence (unless PSPACE $\subseteq$ NP). Proving the above
PSPACE-completeness result for the case of self-independent reaction functions seems challenging.

Problem 7.3. Prove that determining if a system with n nodes, each with a deterministic self- 
independent and historyless reaction function, is convergent is PSPACE-complete. 

8 Some Basic Observations Regarding No-Regret Dynamics 

Regret minimization is fundamental to learning theory, and has strong connections to game- 
theoretic solution concepts; if each player in a repeated game executes a no-regret algorithm when 
selecting strategies, then convergence to an equilibrium is guaranteed in a variety of interesting con- 
texts. The meaning of convergence, and the type of equilibrium reached, vary, and are dependent 
on the restrictions imposed on the game and on the notion of regret. Work on no-regret dynamics 
traditionally considers environments where all nodes are "activated" at each time step. We make 
the simple observation that, switching our attention to r-fair schedules (for every $r \in \mathbb{N}_+$), if an
algorithm has no regret in the classic setting, then it has no regret in this new setting as well (for
all notions of regret). Hence, positive results from the regret-minimization literature extend to this
asynchronous environment. See [3] for a thorough explanation of no-regret dynamics and see
Appendix E for a detailed explanation of our observations. We now mention two implications
of our observation and highlight two open problems regarding regret minimization. 

Observation 8.1. When all players in a zero-sum game use no-external-regret algorithms then
approaching or exceeding the minimax value of the game is guaranteed. 

Observation 8.2. When all players in a (general) game use no-swap-regret algorithms the empir- 
ical distribution of joint players' actions converges to a correlated equilibrium of the game. 
Problem 8.3. Give examples of repeated games for which there exists a schedule of player
activations that is not r-fair for any $r \in \mathbb{N}_+$ for which regret-minimizing dynamics do not converge to an
equilibrium (for different notions of regret/convergence/equilibria).

Problem 8.4. When is convergence of no-regret dynamics to an equilibrium guaranteed (for
different notions of regret/convergence/equilibria) for all r-fair schedules for non-fixed r's, that is,
when r is a function of t?

9 Future Research 

In this paper, we have taken the first steps towards a complete understanding of distributed com- 
puting with adaptive heuristics. We proved a general non-convergence result and several hardness 
results within this model, and also discussed some important aspects such as the implications of 
fairness and randomness, as well as applications to a variety of settings. We believe that we have 
but scratched the surface in the exploration of the convergence properties of simple dynamics in 
asynchronous computational environments, and many important questions remain wide open. We 
now outline several interesting directions for future research. 

Other heuristics, convergence notions, equilibria. We have considered specific adaptive
heuristics, notions of convergence, and kinds of equilibria. Understanding the effects of asynchrony
on other adaptive heuristics (e.g., better-response dynamics, fictitious play), for other notions
of convergence (e.g., of the empirical distributions of play), and for other kinds of equilibria (e.g.,
mixed Nash equilibria, correlated equilibria) is a broad and challenging direction for future research.

Outdated and private information. We have not explicitly considered the effects of making
decisions based on outdated information. We have also not dealt with the case that nodes' behaviors
are dependent on private information, that is, the case that the dynamics are "uncoupled" [18, 19].

Other notions of asynchrony. We believe that better understanding the role of degrees of
fairness, randomness, and other restrictions on schedules from the distributed computing literature in
achieving convergence to equilibrium points is an interesting and important research direction.

Characterizing asynchronous convergence. We still lack characterizations of asynchronous
convergence even for simple dynamics (e.g., deterministic and historyless).

Topological and knowledge-based approaches. Topological [4, 20, 29] and knowledge-based [15]
approaches have been very successful in addressing fundamental questions in distributed computing. 
Can these approaches shed new light on the implications of asynchrony for adaptive heuristics? 
Further exploring the environments in Section 5. We have applied our non-convergence
result to the environments described in Section 5. These environments are of independent interest
and are indeed the subject of extensive research. Hence, the further exploration of dynamics in 
these settings is important. 
A Connections to Consensus Protocols

There are interesting connections between our result and that of Fischer et al. for fault-resilient
consensus protocols. [12] studies the following environment: There is a group of processes, each with
an initial value in $\{0, 1\}$, that communicate with each other via messages. The objective is for all
non-faulty processes to eventually agree on some value $x \in \{0, 1\}$, where the "consensus" x must
match the initial value of some process. [12] establishes that no consensus protocol is resilient to
even a single failure. One crucial ingredient for the proof of the result in [12] is showing that there
exists some initial configuration of processes' initial values such that, from that configuration,
the resulting consensus can be both 0 and 1 (the outcome depends on the specific "schedule"
realized). Our proof of Theorem 4.1 uses a valency argument, an idea introduced in the proof of
the breakthrough non-termination result in [12] for consensus protocols.

Intuitively, the risk of protocol nontermination in [12] stems from the possibility of failures; a
computational node cannot tell whether another node is silent due to a failure or is simply taking a
long time to react. We consider environments in which nodes/communication channels do not fail.
Thus, each node is guaranteed that after "sufficiently many" time steps all other nodes will react.
Observe that in such an environment reaching a consensus is easy; one pre-specified node i (the
"dictator") waits until it learns all other nodes' inputs (this is guaranteed to happen as failures are
impossible) and then selects a value $v_i$ and informs all other nodes; then, all other nodes select $v_i$.
Unlike the results in [12], the possibility of nonconvergence in our framework stems from limitations
on nodes' behaviors. We investigate the link between consensus protocols and our framework further
in Appendix F, where we take an axiomatic approach. We introduce a condition, "Independence
of Decisions" (IoD), that holds for both fault-resilient consensus protocols and for bounded-recall
self-independent dynamics. We then factor the arguments in [12] through IoD to establish a
non-termination result that holds for both contexts, thus unifying the treatment of these dynamic
computational environments.

Hence, there is no immediate translation from the result in [12] to ours (and vice versa). To
illustrate this point, let us revisit Example 4.2, in which the system is convergent, yet two stable
states exist. We observe that in the example there is indeed an initial state from which both stable
states are reachable (a "bivalent state" [12]). Consider the initial state $(y, x)$. Observe that if node
1 is activated first (and alone), then it shall choose action x. Once node 2 is activated it shall then
also choose x, and the resulting stable state shall be $(x, x)$. However, if node 2 is activated first
(alone), then it shall choose action y. Once 1 is activated it shall also choose action y, and the
resulting stable state shall be $(y, y)$. Observe that in Example 4.3 too there exists an action profile
$(x, x)$ from which multiple stable states are reachable yet the system is convergent.
B Games, Circuits, Networks, and Beyond



We present implications of our impossibility result in Section 4 for several well-studied environments:
game theory, circuit design, social networks and Internet protocols. 

B.1 Game Dynamics

The setting. There are n players, $1, \dots, n$. Each player i has a strategy set $S_i$. Let $S = \times_{j \in [n]} S_j$,
and let $S_{-i} = \times_{j \in [n] \setminus \{i\}} S_j$. Each player i has a utility function $u_i : S \to \mathbb{R}$. For each $s_i \in S_i$ and
$s_{-i} \in S_{-i}$ let $(s_i, s_{-i})$ denote the strategy profile in which player i's strategy is $s_i$ and all other
players' strategies are as in $s_{-i}$. Informally, a pure Nash equilibrium is a strategy profile from which
no player wishes to unilaterally deviate.

Definition B.1. (pure Nash equilibria) We say that a strategy profile $\bar{s} = (\bar{s}_1, \dots, \bar{s}_n) \in S$ is
a pure Nash equilibrium if, for each player i, $\bar{s}_i \in \arg\max_{s_i \in S_i} u_i(s_i, \bar{s}_{-i})$.

One natural procedure for reaching a pure Nash equilibrium of a game is best-response dynamics: 
the process starts at some arbitrary strategy profile, and players take turns "best replying" to 
other players' strategies until no player wishes to change his strategy. Convergence of best-response 
dynamics to pure Nash equilibria is the subject of extensive research in game theory and economics, 
and both positive [25, 28] and negative [18, 19] results are known.

Traditionally, work in game theory on game dynamics (e.g., best-response dynamics) relies on
the explicit or implicit premise that players' actions are somehow synchronized (in some contexts
play is sequential, while in others it is simultaneous). We consider the realistic scenario that there
is no computational center that can synchronize players' selection of strategies. We cast the above
setting into the terminology of Section 3 and exhibit an impossibility result for best-response, and
more general, dynamics.

Computational nodes, action spaces. The computational nodes are the n players. The action 
space of each player i is his strategy set Si. 

Reaction functions, dynamics. Under best-response dynamics, each player constantly chooses 
a "best response" to the other players' most recent actions. Consider the case that players have 
consistent tie-breaking rules, i.e., the best response is always unique, and depends only on the 
others' strategies. Observe that, in this case, players' behaviors can be formulated as deterministic, 
historyless, and self-independent reaction functions. The dynamic interaction between players is as
in Section 3.

Existence of multiple pure Nash equilibria implies non-convergence of best-response 
dynamics in asynchronous environments. Theorem 4.1 implies the following result:

Theorem B.2. If there are two (or more) pure Nash equilibria in a game, then asynchronous 
best-response dynamics can potentially oscillate indefinitely. 

In fact, Theorem 4.1 implies that the above non-convergence result holds even for the broader
class of randomized, bounded-recall and self-independent game dynamics, and thus also for game
dynamics such as best-response with bounded recall and consistent tie-breaking rules (studied
in [35]).
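To make Theorem B.2 concrete, here is a minimal sketch on a hypothetical two-player coordination game (the game, payoffs, and function names are assumptions chosen for illustration, not taken from the paper). Each player's best response depends only on the other player's strategy, so the reaction functions are deterministic, historyless and self-independent. The game has two pure Nash equilibria, and activating both players at every step produces an oscillation, while activating them one at a time converges.

```python
from itertools import product

# Hypothetical coordination game: each player earns 1 if the strategies match, else 0.
STRATEGIES = (0, 1)

def utility(player, profile):
    return 1 if profile[0] == profile[1] else 0

def best_response(player, profile):
    """Best reply to the other player's strategy; ties go to the smaller strategy."""
    def payoff(s):
        trial = list(profile)
        trial[player] = s
        return utility(player, tuple(trial))
    return max(STRATEGIES, key=lambda s: (payoff(s), -s))

def pure_nash_equilibria():
    return [p for p in product(STRATEGIES, repeat=2)
            if all(best_response(i, p) == p[i] for i in range(2))]

def step(profile, activated):
    """All activated players best-reply simultaneously to the current profile."""
    return tuple(best_response(i, profile) if i in activated else profile[i]
                 for i in range(2))

def run(profile, schedule):
    trace = [profile]
    for activated in schedule:
        profile = step(profile, activated)
        trace.append(profile)
    return trace

print(pure_nash_equilibria())            # two equilibria: (0, 0) and (1, 1)
print(run((0, 1), [{0, 1}] * 4))         # simultaneous activation: oscillates
print(run((0, 1), [{0}, {1}]))           # one player at a time: converges
```

The oscillating schedule activates every player at every step, so it is as fair as a schedule can be; the non-convergence comes from simultaneity, not from starving a player.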
B.2 Asynchronous Circuits

The setting. There is a Boolean circuit, represented as a directed graph G, in which vertices 
represent the circuit's inputs and the logic gates, and edges represent connections between the 
circuit's inputs and the logic gates and between logic gates. The activation of the logic gates 
is asynchronous. That is, the gates' outputs are initialized in some arbitrary way, and then the 
update of each gate's output, given its inputs, is uncoordinated and unsynchronized. We prove an 
impossibility result for this setting, which has been extensively studied (see [6]). 

Computational nodes, action spaces. The computational nodes are the inputs and the logic 
gates. The action space of each node is {0, 1}. 

Reaction functions, dynamics. Observe that each logic gate can be regarded as a function that 
only depends on its inputs' values. Hence, each logic gate can be modeled via a reaction function. 
Interaction between the different circuit components is as in Section 3.

Too much stability in circuits can lead to instability. Stable states in this framework are
assignments of Boolean values to the circuit inputs and the logic gates that are consistent with
each gate's truth table (reaction function). We say that a Boolean circuit is inherently stable if it is
guaranteed to converge to a stable state regardless of the initial Boolean assignment. The following
theorem is derived from Theorem 4.1:

Theorem B.3. If two (or more) stable Boolean assignments exist for an asynchronous Boolean
circuit, then that asynchronous circuit is not inherently stable. 
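As an illustration of Theorem B.3, the sketch below uses a hypothetical circuit made of two inverters wired in a loop (a bistable latch with no external inputs); the gate names and helper functions are assumptions for illustration only. The circuit has two stable assignments, and updating both gates in the same time step from an inconsistent starting assignment makes the outputs flip forever.

```python
# Hypothetical circuit: two inverters in a loop. Each gate's reaction
# function reads only the other gate's output.
GATES = {
    "g1": lambda out: 1 - out["g2"],  # g1 = NOT g2
    "g2": lambda out: 1 - out["g1"],  # g2 = NOT g1
}

def is_stable(out):
    """An assignment is stable if every gate's output matches its truth table."""
    return all(gate(out) == out[name] for name, gate in GATES.items())

def update(out, activated):
    """Activated gates recompute their outputs from the current (old) values."""
    new = dict(out)
    for name in activated:
        new[name] = GATES[name](out)
    return new

def simulate(out, schedule):
    trace = [dict(out)]
    for activated in schedule:
        out = update(out, activated)
        trace.append(dict(out))
    return trace

stable = [{"g1": a, "g2": b} for a in (0, 1) for b in (0, 1)
          if is_stable({"g1": a, "g2": b})]
print(stable)                                               # two stable assignments
print(simulate({"g1": 0, "g2": 0}, [("g1", "g2")] * 4))     # both gates at once: flips forever
print(simulate({"g1": 0, "g2": 0}, [("g1",), ("g2",)]))     # one gate at a time: settles
```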

B.3 Diffusion of Technologies in Social Networks 

The setting. There is a social network of users, represented by a directed graph in which users are 
the vertices and edges correspond to friendship relationships. There are two competing technologies, 
X and Y. A user's utility from each technology depends on the number of that user's friends that 
use that technology; the more friends use that technology the more desirable that technology is to 
the user. That is, a user would always select the technology used by the majority of his friends. We 
are interested in the dynamics of the diffusion of technologies. Observe that if, initially, all users 
are using X, or all users are using Y, no user has an incentive to switch to a different technology. 
Hence, there are always (at least) two distinct "stable states" (regardless of the topology of the 
social network). Therefore, the terminology of Section [3] can be applied to this setting. 

Computational nodes, action spaces. The users are the computational nodes. Each user i's
action space consists of the two technologies {X, Y}.

Reaction functions, dynamics. The reaction function of each user i is defined as follows: If at
least half of i's friends use technology X, i selects technology X; otherwise, i selects technology Y.
In our model of diffusion of technologies, users' choices of technology can be made simultaneously,
as described in Section 3.

Instability of social networks. Theorem 4.1 implies the following:

Theorem B.4. In every social network, the diffusion of technologies can potentially never converge 
to a stable global state. 
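The following sketch illustrates Theorem B.4 on a hypothetical four-user network in which friendships form a cycle (the network and all names are assumptions made for illustration). It implements the reaction function stated above, where a user picks X when at least half of their friends use X, and shows that simultaneous updates from an alternating initial state never settle, even though the all-X and all-Y states are stable.

```python
# Hypothetical network: users 0..3 on a cycle, each with two friends.
FRIENDS = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}

def react(user, state):
    """Select X if at least half of the user's friends use X, otherwise Y."""
    friends = FRIENDS[user]
    using_x = sum(1 for f in friends if state[f] == "X")
    return "X" if 2 * using_x >= len(friends) else "Y"

def step(state, activated):
    """Activated users update simultaneously, based on the current state."""
    return tuple(react(u, state) if u in activated else state[u]
                 for u in range(len(state)))

def is_stable(state):
    return step(state, set(range(len(state)))) == state

everyone = {0, 1, 2, 3}
state = ("X", "Y", "X", "Y")
for _ in range(4):
    print(state)
    state = step(state, everyone)   # alternates between XYXY and YXYX
```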
B.4 Interdomain Routing

The setting. The Internet is made up of smaller networks called Autonomous Systems (ASes). 
Interdomain routing is the task of establishing routes between ASes, and is handled by the Border 
Gateway Protocol (BGP). In the standard model for analyzing BGP dynamics [14], there is a
network of source ASes that wish to send traffic to a unique destination AS d. Each AS i has
a ranking function $<_i$ that specifies i's strict preferences over all simple (loop-free) routes leading
from i to d. Under BGP, each AS constantly selects the "best" route that is available to it. See [14]
for more details. Guaranteeing BGP safety, i.e., BGP convergence to a "stable" routing outcome, is
a fundamental desideratum that has been the subject of extensive work in both the networking and
the standards communities. We now cast interdomain routing into the terminology of Section 3.
We then obtain non-termination results for BGP and for proposals for new interdomain routing
protocols (as corollaries of Theorem 4.1).

Computational nodes, action spaces. The ASes are the computational nodes. The action space 
of each node i, $A_i$, is the set of all simple (loop-free) routes between i and the destination d that
are exportable to i, and the empty route $\emptyset$.

Reaction functions, dynamics. The reaction function $f_i$ of node i outputs, for every vector a
containing routes to d of all of i's neighbors, a route $(i, j)R_j$ such that (1) j is i's neighbor; (2)
$R_j$ is j's route in a; and (3) $R_j >_i R$ for all other routes R in a. If there is no such route $R_j$ in
a then $f_i$ outputs $\emptyset$. Observe that the reaction function $f_i$ is deterministic, self-independent and
historyless. The interaction between nodes is as described in Section 3.

The multitude of stable routing trees implies global network instability. Theorem 4.1
implies a recent result of Sami et al. [30], which shows that the existence of two (or more) stable
routing trees to which BGP can (potentially) converge implies that BGP is not safe. Importantly,
the asynchronous model of Section 3 is significantly more restrictive than that of [30]. Hence,
Theorem 4.1 implies the non-termination result of Sami et al.

Theorem B.5. [30] If there are multiple stable routing trees in a network, then BGP is not safe
on that network. 
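A small instance makes Theorem B.5 tangible. The sketch below is an illustrative assumption rather than an example from the paper: two ASes, 1 and 2, are connected to each other and to the destination d, and each prefers the route through the other over its direct route (a configuration in the spirit of the two-node "DISAGREE" gadget from the BGP literature). The reaction function follows the description above: it looks only at neighbors' current routes, discards extensions that would loop, and picks the most preferred remaining route.

```python
D = "d"
NEIGHBORS = {1: [2, D], 2: [1, D]}
# Hypothetical rankings, most preferred route first.
RANKING = {1: [(1, 2, D), (1, D)], 2: [(2, 1, D), (2, D)]}

def react(i, routes):
    """Best available route for AS i given its neighbors' routes; () is the empty route."""
    candidates = []
    for j in NEIGHBORS[i]:
        rj = (D,) if j == D else routes[j]
        if rj and i not in rj:          # neighbor has a route and extending it is loop-free
            candidates.append((i,) + rj)
    for r in RANKING[i]:
        if r in candidates:
            return r
    return ()

def step(routes, activated):
    """Activated ASes re-select routes simultaneously from the current routes."""
    return {i: (react(i, routes) if i in activated else routes[i]) for i in routes}

def is_stable(routes):
    return step(routes, set(routes)) == routes

routes = {1: (1, D), 2: (2, D)}
for _ in range(4):
    print(routes)
    routes = step(routes, {1, 2})   # both ASes activated: direct <-> indirect forever
```

Both `{1: (1, 2, D), 2: (2, D)}` and `{1: (1, D), 2: (2, 1, D)}` are stable routing trees here, and activating only one AS from the initial state reaches one of them.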

Over the past few years, there have been several proposals for BGP-based multipath routing 
protocols, i.e., protocols that enable each node (AS) to send traffic along multiple routes, e.g., 
R-BGP [22] and Neighbor-Specific BGP [33] (NS-BGP). Under both R-BGP and NS-BGP each
computational node's actions are independent of its own past actions and are based on bounded
recall of past interaction. Thus, Theorem 4.1 implies the following:

Theorem B.6. If there are multiple stable routing configurations in a network, then R-BGP is not 
safe on that network. 

Theorem B.7. If there are multiple stable routing configurations in a network, then NS-BGP is
not safe on that network. 

Footnote (on AS ranking functions): ASes' rankings of routes also reflect each AS's export policy, which
specifies which routes that AS is willing to make available to each neighboring AS.

B.5 Congestion Control

The setting. We now present the model of congestion control studied in [13]. There is a network
of routers, represented by a directed graph $G = (V, E)$, where $|E| \ge 2$, in which vertices represent
routers, and edges represent communication links. Each edge e has capacity $c_e$. There are n
source-target pairs of vertices $(s_i, t_i)$, termed "connections", that represent communicating pairs of
end-hosts. Each source-target pair $(s_i, t_i)$ is connected via some fixed route, $R_i$. Each source $s_i$ transmits
at a constant rate $\gamma_i > 0$. Routers have queue management, or queueing, policies, that dictate
how traffic traversing a router's outgoing edge should be divided between the connections whose
routes traverse that edge. The network is asynchronous and so routers' queueing decisions can be
made simultaneously. See [13] for more details.

Computational nodes, action spaces. The computational nodes are the edges. The action space
of each edge e intuitively consists of all possible ways to divide traffic going through e between
the connections whose routes traverse e. More formally, for every edge e, let $N(e)$ be the number
of connections whose paths go through e. Then e's action space is
$A_e = \{x = (x_1, \dots, x_{N(e)}) \mid x \in \mathbb{R}_{\ge 0}^{N(e)} \text{ and } \sum_i x_i \le c_e\}$.

Reaction functions, dynamics. Each edge e's reaction function, $f_e$, models the queueing policy
according to which e's capacity is shared: for every $N(e)$-tuple of nonnegative incoming flows
$(w_1, w_2, \dots, w_{N(e)})$, $f_e$ outputs an action $(x_1, \dots, x_{N(e)}) \in A_e$ such that $w_i \ge x_i$ for all $i \in [N(e)]$ (a
connection's flow leaving the edge cannot be bigger than that connection's flow entering the edge).
The interaction between the edges is as described in Section 3.

Multiple equilibria imply potential fluctuations of connections' throughputs. [13] shows
that, while one might expect that if sources transmit flow at a constant rate, flow will also be
received at a constant rate, this is not necessarily the case. Indeed, [13] presents examples in which
connections' throughputs can potentially fluctuate ad infinitum. Equilibria (which correspond to
stable states in Section 3) are global configurations of connections' flows on edges such that
connections' incoming and outgoing flows on each edge are consistent with the queue management policy
of the router controlling that edge. Using Theorem 4.1 we can obtain the following impossibility
result:

Theorem B.8. If there are multiple capacity-allocation equilibria in the network then dynamics of
congestion can potentially oscillate indefinitely.

C r-Convergence and Randomness

We now consider the implications for convergence of two natural restrictions on schedules: 
r-fairness and randomization. 

C.1 Snakes in Boxes and r-Convergence

Theorem 4.1 deals with convergence and not r-convergence, and thus does not impose restrictions
on the number of consecutive time steps in which a node can be nonactive. What happens if there 
is an upper bound on this number, r? We now show that if r < n — 1 then sometimes convergence 
of historyless and self-independent dynamics is achievable even in the presence of multiple stable 
states (and so our impossibility result breaks). 

Footnote (Section B.5, on source transmission rates): This is modeled via the addition of an edge $e = (u, s_i)$ to G, such that $c_e = \gamma_i$, and u has no incoming edges.
Example C.1. (a system that is convergent for $r < n - 1$ but nonconvergent for $r = n - 1$)

There are $n > 2$ nodes, $1, \dots, n$, each with the action space $\{x, y\}$. Nodes' deterministic, historyless
and self-independent reaction functions are as follows: for all $i \in [n]$, $f_i(x^{n-1}) = x$ and $f_i$ outputs
y otherwise. Observe that there exist two stable states: $x^n$ and $y^n$. Observe that if $r = n - 1$ then
the following oscillation is possible. Initially, only node 1's action is y and all other nodes' actions
are x. Then, nodes 1 and 2 are activated and, consequently, node 1's action becomes x and node 2's
action becomes y. Next, nodes 2 and 3 are activated, and thus 2's action becomes x and 3's action
becomes y. Then 3, 4 are activated, then 4, 5, and so on (traversing all nodes over and over again
in cyclic order). This goes on indefinitely, never reaching one of the two stable states. Observe
that, indeed, each node is activated at least once within every sequence of $n - 1$ consecutive time
steps. We observe, however, that if $r < n - 1$ then convergence is guaranteed. To see this, observe
that if at some point in time there are at least two nodes whose action is y, then convergence to
$y^n$ is guaranteed. Clearly, if all nodes' action is x then convergence to $x^n$ is guaranteed. Thus, an
oscillation is possible only if, in each time step, exactly one node's action is y. Observe that, given
our definition of nodes' reaction functions, this can only be if the activation sequence is (essentially)
as described above, i.e., exactly two nodes are activated at a time. Observe also that this kind of
activation sequence is impossible for $r < n - 1$.
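The cyclic oscillation described in Example C.1 can be replayed directly. The sketch below is a minimal simulation under the example's reaction functions; nodes are indexed from 0 rather than 1, and the function names are illustrative. At step t it activates nodes t and t + 1 (modulo n), so every node is idle for at most $n - 2$ consecutive steps, and the single y "travels" around the ring forever.

```python
def react(i, state):
    """f_i: choose x iff every other node currently plays x; otherwise choose y."""
    return "x" if all(a == "x" for j, a in enumerate(state) if j != i) else "y"

def step(state, activated):
    """Activated nodes react simultaneously to the current state."""
    return tuple(react(i, state) if i in activated else state[i]
                 for i in range(len(state)))

def cyclic_oscillation(n, steps):
    """Start with node 0 playing y and activate the pairs {t, t+1} mod n in turn."""
    state = ("y",) + ("x",) * (n - 1)
    trace = [state]
    for t in range(steps):
        state = step(state, {t % n, (t + 1) % n})
        trace.append(state)
    return trace

for state in cyclic_oscillation(5, 10):
    print("".join(state))
```

Every state in the trace contains exactly one y, so neither $x^n$ nor $y^n$ is ever reached.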

What about r > n? We use classical results in combinatorics regarding the size of a
"snake-in-the-box" in a hypercube [1] to show that some systems are r-convergent for exponentially-large r's,
but are not convergent in general.

Theorem 6.2. Let $n \in \mathbb{N}$ be sufficiently large. There exists a system G with n nodes, in which
each node i has two possible actions and each $f_i$ is deterministic, historyless and self-independent,
such that

1. G is r-convergent for $r \in \Omega(2^n)$;

2. G is not (r + 1)-convergent.

Proof. Let the action space of each of the n nodes be $\{x, y\}$. Consider the possible action profiles
of nodes $3, \dots, n$, i.e., the set $\{x, y\}^{n-2}$. Observe that this set of actions can be regarded as the
$(n - 2)$-hypercube $Q_{n-2}$, and thus can be visualized as the graph whose vertices are indexed by the
binary $(n - 2)$-tuples and such that two vertices are adjacent iff the corresponding $(n - 2)$-tuples
differ in exactly one coordinate.

Definition C.2. (chordless paths, snakes) A chordless path in a hypercube $Q_n$ is a path $P =
(v_0, \dots, v_w)$ such that for each $v_i, v_j$ on P, if $v_i$ and $v_j$ are neighbors in $Q_n$ then $v_j \in \{v_{i-1}, v_{i+1}\}$.
A snake in a hypercube is a simple chordless cycle.
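The definition can be checked mechanically on small hypercubes. The sketch below, with illustrative function names, represents vertices as tuples of bits and tests whether a given cyclic sequence of vertices is a snake: consecutive vertices must be adjacent, and no two non-consecutive vertices may be adjacent (no chords).

```python
def adjacent(u, v):
    """Hypercube adjacency: the bit tuples differ in exactly one coordinate."""
    return sum(a != b for a, b in zip(u, v)) == 1

def is_snake(cycle):
    """True iff `cycle`, read cyclically, is a simple chordless cycle in the hypercube."""
    m = len(cycle)
    if m < 4 or len(set(cycle)) != m:   # hypercubes are bipartite: shortest cycle has length 4
        return False
    for i in range(m):
        for j in range(i + 1, m):
            consecutive = (j - i == 1) or (i == 0 and j == m - 1)
            # Vertices must be adjacent exactly when they are consecutive on the cycle.
            if adjacent(cycle[i], cycle[j]) != consecutive:
                return False
    return True

# A snake of length 6 in the 3-hypercube.
snake_q3 = [(0, 0, 0), (0, 0, 1), (0, 1, 1), (1, 1, 1), (1, 1, 0), (1, 0, 0)]
# A Hamiltonian cycle in the 3-hypercube: a cycle, but it has chords.
hamiltonian_q3 = [(0, 0, 0), (0, 0, 1), (0, 1, 1), (0, 1, 0),
                  (1, 1, 0), (1, 1, 1), (1, 0, 1), (1, 0, 0)]
print(is_snake(snake_q3), is_snake(hamiltonian_q3))
```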

The following result is due to Abbott and Katchalski [1].

Theorem C.3. [1] Let $t \in \mathbb{N}$ be sufficiently large. Then, the size $|S|$ of a maximal snake in the
t-hypercube $Q_t$ is at least $\lambda \times 2^t$ for some $\lambda \ge 0.3$.

Hence, the size of a maximal snake in the $Q_{n-2}$ hypercube is $\Omega(2^n)$. Let S be a maximal snake
in $\{x, y\}^{n-2}$. W.l.o.g. we can assume that $x^{n-2}$ is on S (otherwise we can rename nodes' actions so
as to achieve this). Nodes' deterministic, historyless and self-independent reaction functions are as follows:

• Node $i \in \{1, 2\}$: $f_i(x^{n-1}) = x$; $f_i = y$ otherwise.
• Node $i \in \{3, \dots, n\}$: if the actions of nodes 1 and 2 are both y then the action y is chosen,
i.e., $f_i(yy * \dots *) = y$; otherwise, $f_i$ only depends on the actions of nodes in $\{3, \dots, n\}$ and
therefore to describe $f_i$ it suffices to orient the edges of the hypercube $Q_{n-2}$ (an edge from
one vertex to another vertex that differs from it in the ith coordinate determines the outcome
of $f_i$ for both). This is done as follows: orient the edges in S so as to create a cycle (in one
of two possible ways); orient edges between vertices not in S to vertices in S towards the
vertices in S; orient all other edges arbitrarily.

Observation C.4. $y^n$ is the unique stable state of the system.

Observation C.5. If, at some point in time, both nodes 1 and 2's actions are y then convergence
to the stable state is guaranteed.

Claim C.6. If there is an oscillation then there must be infinitely many time steps in which the
actions of nodes $2, \dots, n$ are $x^{n-1}$.

Proof. Consider the case that the statement does not hold. In that case, from some moment forth,
node 1 never sees the actions $x^{n-1}$ and so will constantly select the action y. Once that happens,
node 2 shall also not see the actions $x^{n-1}$ and will thereafter also select y. Convergence to $y^n$ is
then guaranteed. □

We now show that the system is convergent for $r < |S|$, but is nonconvergent if $r = |S|$. The
theorem follows.

Claim C.7. If r < \S\ then convergence to the stable state is guaranteed. 

Proof. Claim C.6 establishes that in an oscillation there must be infinitely many time steps
in which the actions of nodes $2, \dots, n$ are $x^{n-1}$. Consider one such moment in time. Observe that in
the subsequent time steps nodes' action profiles will inevitably change as in S (given our definition
of the reaction functions of nodes $3, \dots, n$). Thus, once the action profile is no longer $x^{n-1}$ there are at
least $|S| - 1$ time steps until it goes back to being $x^{n-1}$. Observe that if 1 and 2 are activated at
some point in the intermediate time steps (which is guaranteed as $r < |S|$) then the actions of both
shall be y and so convergence to $y^n$ is guaranteed. □

Claim C.8. If r = \S\ then an oscillation is possible. 

Proof. The oscillation is as follows. Start at $x^n$ and activate both 1 and 2 (this will not change
the action profile). In the $|S| - 1$ subsequent time steps activate all nodes but 1 and 2 until $x^n$ is
reached again. Repeat ad infinitum. □

□ 

We note that the construction in the proof of Theorem 6.2 is such that there is a unique stable
state. We believe that the same ideas can be used to prove the same result for systems with 
multiple stable states but the exact way of doing this eludes us at the moment, and is left as an 
open question. 

Problem C.9. Prove that for every sufficiently large $n \in \mathbb{N}$, there exists a system G with n
nodes, in which each node i has two possible actions and each $f_i$ is deterministic, historyless and
self-independent, such that

1. G is r-convergent for $r \in \Omega(2^n)$;

2. G is not (r + 1)-convergent;

3. There are multiple stable states in G.

C.2 Does Random Choice (of Initial State and Schedule) Help?

Theorem 4.1 tells us that a system with multiple stable states is nonconvergent if the initial state
and the node-activation schedule are chosen adversarially. Can we guarantee convergence if the
initial state and schedule are chosen at random?

Example C.10. (random choice of initial state and schedule might not help) There are n
nodes, $1, \dots, n$, and each node has action space $\{x, y, z\}$. The (deterministic, historyless and
self-independent) reaction function of each node $i \in \{3, \dots, n\}$ is such that $f_i(x^{n-1}) = x$;
$f_i(z^{n-1}) = z$; and $f_i = y$ for all other inputs. The (deterministic, historyless and self-independent)
reaction function of each node $i \in \{1, 2\}$ is such that $f_i(x^{n-1}) = x$; $f_i(z^{n-1}) = z$;
$f_i(xy^{n-2}) = y$; $f_i(y^{n-1}) = x$; and $f_i = y$ for all other inputs. Observe that there are exactly
two stable states: $x^n$ and $z^n$. Observe also that if nodes' actions in the initial state do not contain
at least $n - 1$ x's, or at least $n - 1$ z's, then, from that moment forth, each activated node in the set
$\{3, \dots, n\}$ will choose the action y. Thus, eventually the actions of all nodes in $\{3, \dots, n\}$ shall
be y, and so neither of the two stable states will be reached. Hence, of the $3^n$ possible initial states,
only $4n + 2$ allow a stable state to be reached.

Example C.10 presents a system with multiple stable states such that from most initial states all
possible choices of schedules do not result in a stable state. Hence, when choosing the initial state 
uniformly at random the probability of landing on a "good" initial state (in terms of convergence) 
is exponentially small. 

D Complexity of Asynchronous Dynamics 

We now explore the communication complexity and computational complexity of determining 
whether a system is convergent. We present hardness results in both models of computation 
even for the case of deterministic and historyless adaptive heuristics. Our computational complex- 
ity result shows that even if nodes' reaction functions can be succinctly represented, determining 
whether the system is convergent is PSPACE-complete. This intractability result, alongside its 
computational implications, implies that we cannot hope to have short "witnesses" of guaranteed 
asynchronous convergence (unless PSPACE $\subseteq$ NP).

D.1 Communication Complexity

We prove the following communication complexity result, that shows that, in general, determining 
whether a system is convergent cannot be done efficiently. 

Theorem D.1. Determining if a system with n nodes, each with 2 actions, is convergent requires
$\Omega(2^n)$ bits. This holds even if all nodes have deterministic, historyless and self-independent reaction
functions.

Proof. To prove our result we present a reduction from the following well-known problem in
communication complexity theory.

2-party SET DISJOINTNESS: There are two parties, Alice and Bob. Each party holds a subset of
$\{1, \ldots, q\}$; Alice holds the subset $E^A$ and Bob holds the subset $E^B$. The objective is to determine
whether $E^A \cap E^B = \emptyset$. The following is well known.

Theorem D.2. Determining whether $E^A \cap E^B = \emptyset$ requires (in the worst case) the communication
of $\Omega(q)$ bits. This lower bound applies to randomized protocols with bounded 2-sided error and also
to nondeterministic protocols.

We now present a reduction from 2-party SET DISJOINTNESS to the question of determining
whether a system with deterministic, historyless and self-independent reaction functions is
convergent. Given an instance of SET DISJOINTNESS we construct a system with n nodes, each with
two actions, as follows (the relation between the parameter q in SET DISJOINTNESS and the
number of nodes n is to be specified later). Let the action space of each node be $\{x, y\}$. We now
define the reaction functions of the nodes. Consider the possible action profiles of nodes $3, \ldots, n$,
i.e., the set $\{x, y\}^{n-2}$. Observe that this set of action profiles can be regarded as the $(n-2)$-hypercube
$Q_{n-2}$, and thus can be visualized as the graph whose vertices are indexed by the binary $(n-2)$-tuples
and such that two vertices are adjacent if and only if the corresponding $(n-2)$-tuples differ
in exactly one coordinate.

Definition D.3. (chordless paths, snakes) A chordless path in a hypercube $Q_n$ is a path $P =
(v_0, \ldots, v_w)$ such that for each $v_i, v_j$ on P, if $v_i$ and $v_j$ are neighbors in $Q_n$ then $v_j \in \{v_{i-1}, v_{i+1}\}$.
A snake in a hypercube is a simple chordless cycle.
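
To make the definition concrete, here is a small Python sketch that checks whether a given cyclic list of hypercube vertices is a snake. Vertices are encoded as integers whose binary digits are the coordinates; this encoding and the function names are our own illustrative choices, not part of the construction.

```python
def adjacent(u, v):
    """Hypercube vertices (as integers) are adjacent iff they differ in exactly one bit."""
    diff = u ^ v
    return diff != 0 and diff & (diff - 1) == 0

def is_snake(cycle):
    """Check that `cycle` (listed in cyclic order) is a simple chordless cycle."""
    k = len(cycle)
    if k < 4 or len(set(cycle)) != k:
        return False
    for a in range(k):
        for b in range(a + 1, k):
            consecutive = (b - a == 1) or (a == 0 and b == k - 1)
            # Consecutive vertices must be adjacent; all other pairs must not be (no chords).
            if consecutive != adjacent(cycle[a], cycle[b]):
                return False
    return True

print(is_snake([0b000, 0b001, 0b011, 0b111, 0b110, 0b100]))  # a 6-cycle in Q_3
print(is_snake([0b000, 0b001, 0b011, 0b010, 0b110, 0b100]))  # 000 and 010 form a chord
```

The first cycle is a snake in $Q_3$; the second is a simple cycle but has a chord, so it is not.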

The following result is due to Abbott and Katchalski [1].

Theorem D.4. [1] Let $t \in \mathbb{N}$ be sufficiently large. Then, the size $|S|$ of a maximal snake in the
t-hypercube $Q_t$ is at least $\lambda \times 2^t$ for some $\lambda > 0.3$.

Hence, the size of a maximal snake in the $Q_{n-2}$ hypercube is $\Omega(2^n)$. Let S be a maximal snake
in $Q_{n-2}$. We now show our reduction from SET DISJOINTNESS with $q = |S|$. We identify
each element $j \in \{1, \ldots, q\}$ with a unique vertex $v_j \in S$. W.l.o.g. we can assume that $x^{n-2}$ is on S
(otherwise we can rename nodes' actions to achieve this). For ease of exposition we also assume
that $y^{n-2}$ is not on S (getting rid of this assumption is easy). Nodes' reaction functions are as
follows.

• Node 1: If $v_j = (v_{j,1}, \ldots, v_{j,n-2}) \in S$ is a vertex that corresponds to an element $j \in E^A$, then
$f_1(y, v_{j,1}, \ldots, v_{j,n-2}) = x$; otherwise, $f_1$ outputs y.

• Node 2: If $v_j = (v_{j,1}, \ldots, v_{j,n-2}) \in S$ is a vertex that corresponds to an element $j \in E^B$, then
$f_2(y, v_{j,1}, \ldots, v_{j,n-2}) = x$; otherwise, $f_2$ outputs y.

• Node $i \in \{3, \ldots, n\}$: if the actions of nodes 1 and 2 are not both x then the action y is
chosen; otherwise, $f_i$ only depends on the actions of nodes in $\{3, \ldots, n\}$ and therefore to describe
$f_i$ it suffices to orient the edges of the hypercube $Q_{n-2}$ (an edge from one vertex to another
vertex that differs from it in the ith coordinate determines the outcome of $f_i$ for both). This
is done as follows: orient the edges in S so as to create a cycle (in one of two possible ways);
orient edges between vertices not in S and vertices in S towards the vertices in S; orient all
other edges arbitrarily.

Observation D.5. $y^n$ is the unique stable state of the system.

In our reduction Alice simulates node 1 (whose reaction function is based on $E^A$), Bob simulates
node 2 (whose reaction function is based on $E^B$), and one of the two parties simulates all other
nodes (whose reaction functions are based on neither $E^A$ nor $E^B$). The theorem now follows
from the combination of the following claims:

Claim D.6. In an oscillation there must be infinitely many time steps in which the actions of both
nodes 1 and 2 are x.

Proof. By contradiction. Suppose that from some moment forth it is never the case that
the actions of both nodes 1 and 2 are x. Observe that from that time onwards the nodes $3, \ldots, n$ will
always choose the action y. Hence, after some time has passed the actions of all nodes in $\{3, \ldots, n\}$
will be y. Observe that whenever nodes 1 and 2 are activated thereafter they will choose the action
y and so we have convergence to the stable state $y^n$. □

Claim D.7. The system is not convergent iff $E^A \cap E^B \neq \emptyset$.

Proof. We know (Claim D.6) that if there is an oscillation then there are infinitely many time steps
in which the actions of both nodes 1 and 2 are x. We argue that this implies that there must be infinitely
many time steps in which both nodes select action x simultaneously. Indeed, recall that node 1
only chooses action x if node 2's action is y, and vice versa, and so if both nodes never choose x
simultaneously, then it is never the case that both nodes' actions are x at the same time step (a
contradiction). Now, when is it possible for both 1 and 2 to choose x at the same time? Observe
that this can only be if the actions of the nodes in $\{3, \ldots, n\}$ constitute an element that is in both
$E^A$ and $E^B$. Hence, $E^A \cap E^B \neq \emptyset$. □

□ 



D.2 Computational Complexity 

The above communication complexity hardness result required the representation of the reaction 
functions to (potentially) be exponentially long. What if the reaction functions can be succinctly 
described? We now present a strong computational complexity hardness result for the case that 
each reaction function $f_i$ is deterministic and historyless, and is given explicitly in the form of a
boolean circuit (for each $a \in A$ the circuit outputs $f_i(a)$).



Theorem 7.2. Determining if a system with n nodes, each with a deterministic and historyless
reaction function, is convergent is PSPACE-complete.

Proof. Our proof is based on the proof of Fabrikant and Papadimitriou [9] that BGP safety is
PSPACE-complete. Importantly, the result in [9] does not imply Theorem 7.2, since [9] only considers
dynamics in which nodes are activated one at a time. We present a reduction from the following
problem.

STRING NONTERMINATION: The input is a function $g : \Gamma^t \to \Gamma \cup \{halt\}$, for some alphabet $\Gamma$,
given in the form of a boolean circuit. The objective is to determine whether there exists an initial
string $T = (T_0, \ldots, T_{t-1}) \in \Gamma^t$ such that the following procedure does not halt.

1. i := 0

2. While $g(T) \neq halt$ do

• $T_i := g(T)$

• i := (i + 1) modulo t
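
The procedure above can be sketched in Python as follows. Because the pair (T, i) ranges over a finite set and the procedure is deterministic, a repeated pair means the procedure never halts, so the sketch decides halting for a single initial string by cycle detection. Here g is passed as an ordinary Python function rather than a boolean circuit, and the function names and the two example functions are hypothetical, chosen only for illustration.

```python
from itertools import product

HALT = "halt"

def procedure_halts(g, initial):
    """Run the procedure from string `initial`; return False if it loops forever."""
    T = list(initial)
    t = len(T)
    i = 0
    seen = set()
    while True:
        out = g(tuple(T))
        if out == HALT:
            return True
        state = (tuple(T), i)
        if state in seen:  # deterministic dynamics revisiting a state never halt
            return False
        seen.add(state)
        T[i] = out
        i = (i + 1) % t

def string_nonterminating(g, alphabet, t):
    """Brute force over all |alphabet|^t initial strings (exponential time)."""
    return any(not procedure_halts(g, T) for T in product(alphabet, repeat=t))

# Hypothetical examples over the alphabet {0, 1}.
def g_fill(T):
    return HALT if all(s == 1 for s in T) else 1

def g_flip(T):
    return 1 - T[0]

print(string_nonterminating(g_fill, (0, 1), 3))
print(string_nonterminating(g_flip, (0, 1), 2))
```

The brute-force search takes exponential time in t, which is consistent with (though of course not evidence for) the hardness of the problem.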

STRING NONTERMINATION is closely related to STRING HALTING from [9] and is also
PSPACE-complete. We now present a reduction from STRING NONTERMINATION to the 
question of determining whether a system with deterministic and historyless reaction functions 
is convergent. 

We construct a system with n = t + 1 nodes. The node set is divided into t "index nodes"
$0, \ldots, t-1$ and a single "counter node" x. The action space of each index node is $\Gamma \cup \{halt\}$ and
the action space of the counter node is $\{0, \ldots, t-1\} \times (\Gamma \cup \{halt\})$. Let $a = (a_0, \ldots, a_{t-1}, a_x)$ be
an action profile of the nodes, where $a_x = (j, \gamma)$ is the action of the counter node. We now define
the deterministic and historyless reaction functions of the nodes:

• The reaction function of index node $i \in \{0, \ldots, t-1\}$, $f_i$: if $\gamma = halt$, then $f_i(a) = halt$;
otherwise, if $j = i$ and $a_j \neq \gamma$, then $f_i(a) = \gamma$; otherwise, $f_i(a) = a_i$.

• The reaction function of the counter node, $f_x$: if $\gamma = halt$, then $f_x(a) = a_x$; if $a_j = \gamma$, then
$f_x(a) = ((j + 1) \text{ modulo } t,\ g(a_0, \ldots, a_{t-1}))$; otherwise $f_x(a) = a_x$.
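
The two reaction functions can be written out as a short Python sketch. An action profile is represented as a tuple whose last entry is the counter node's action (j, gamma); this representation, the function names, and the example function g_fill are our own assumptions for illustration only.

```python
HALT = "halt"

def index_reaction(i, profile):
    """Reaction of index node i to the profile (a_0, ..., a_{t-1}, (j, gamma))."""
    *symbols, (j, gamma) = profile
    if gamma == HALT:
        return HALT
    if j == i and symbols[j] != gamma:
        return gamma          # copy the symbol that the counter node asks i to write
    return symbols[i]         # otherwise keep the current action

def counter_reaction(g, profile):
    """Reaction of the counter node x to the same profile."""
    *symbols, (j, gamma) = profile
    t = len(symbols)
    if gamma == HALT:
        return (j, gamma)
    if symbols[j] == gamma:   # node j has written gamma: advance and compute the next symbol
        return ((j + 1) % t, g(tuple(symbols)))
    return (j, gamma)

# Hypothetical g: halt once every symbol equals 1, otherwise write 1.
def g_fill(T):
    return HALT if all(s == 1 for s in T) else 1

profile = (0, 0, (0, 1))
print(index_reaction(0, profile), index_reaction(1, profile))
print(counter_reaction(g_fill, profile))
print(counter_reaction(g_fill, (1, 0, (0, 1))))
```

In this sketch the counter node waits until index node j has adopted the symbol gamma before it advances, which is how the construction mirrors one iteration of the loop in the procedure.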
The theorem now follows from the following claims that, in turn, follow from our construction: 

Claim D.8. (halt, . . . , halt) is the unique stable state of the system. 

Proof. Observe that (halt, . . . , halt) is indeed a stable state of the system. The uniqueness of this
stable state is proven via a simple case-by-case analysis. □ 

Claim D.9. If there exists an initial string $T = (T_0, \ldots, T_{t-1})$ for which the procedure does not
terminate then there exists an initial state from which the system does not converge to the stable
state (halt, . . . , halt) regardless of the schedule chosen.

Proof. Consider the evolution of the system from the initial state in which the action of index node
i is $T_i$ and the action of the counter node is $(0, g(T))$. □

Claim D.10. If there does not exist an initial string T for which the procedure does not terminate
then the system is convergent.

Proof. Observe that if there is an initial state $a = (a_0, \ldots, a_{t-1}, a_x)$ and a fair schedule for which
the system does not converge to the unique stable state then the procedure does not halt for the
initial string $T = (a_0, \ldots, a_{t-1})$. □

□ 

Proving the above PSPACE-completeness result for the case of self-independent reaction functions
seems challenging.

Problem D.11. Prove that determining if a system with n nodes, each with a deterministic,
self-independent and historyless reaction function, is convergent is PSPACE-complete.

E Some Basic Observations Regarding No-Regret Dynamics



Regret minimization is fundamental to learning theory. The basic setting is as follows. There is a
space of m actions (e.g., possible routes to work), which we identify with the set $[m] = \{1, \ldots, m\}$.
In each time step $t \in \{1, 2, \ldots\}$, an adversary selects a profit function $p_t : [m] \to [0, 1]$ (e.g., how
fast traffic is flowing along each route), and the (randomized) algorithm chooses a distribution $D_t$
over the elements in [m]. When choosing $D_t$ the algorithm can only base its decision on the profit
functions $p_1, \ldots, p_{t-1}$, and not on $p_t$ (which is revealed only after the algorithm makes its decision).
The algorithm's gain at time t is $g_t = \sum_{j=1}^{m} D_t(j)\, p_t(j)$, and its accumulated gain at time t is
$\sum_{i=1}^{t} g_i$. Regret analysis is useful for designing adaptive algorithms that fare well in such uncertain
environments. The motivation behind regret analysis is ensuring that, over time, the algorithm
performs at least as well in retrospect as some alternative "simple" algorithm.

We now informally present the three main notions of regret (see [3] for a thorough explanation):
(1) External regret compares the algorithm's performance to that of simple algorithms that select
the exact same action in each time step (e.g., "you should have always taken Broadway, and
never chosen other routes"). (2) Internal regret and swap regret analysis compare the gain from
the sequence of actions actually chosen to that derived from replacing every occurrence of an
action i with another action j (e.g., "every time you chose Broadway you should have taken
7th Avenue instead"). While internal regret analysis allows only one action to be replaced by
another, swap regret analysis considers all mappings from [m] to [m]. The algorithm has no
(external/internal/swap) regret if the gap between the algorithm's gain and the gain from the best
alternative policy allowed vanishes with time.
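
As a small numerical illustration of external regret, the Python sketch below computes the accumulated expected gain of a sequence of distributions and compares it with the gain of the best single action in hindsight. Profit functions and distributions are represented as lists of length m; the function names and the example numbers are ours and purely illustrative.

```python
def gain(distribution, profit):
    """Expected gain at one time step: sum over actions j of D_t(j) * p_t(j)."""
    return sum(d * p for d, p in zip(distribution, profit))

def external_regret(distributions, profits):
    """Gain of the best fixed action in hindsight minus the algorithm's accumulated gain."""
    m = len(profits[0])
    accumulated = sum(gain(d, p) for d, p in zip(distributions, profits))
    best_fixed = max(sum(p[j] for p in profits) for j in range(m))
    return best_fixed - accumulated

# Two routes, three time steps; the algorithm always mixes uniformly.
profits = [[1, 0], [0, 1], [1, 0]]
distributions = [[0.5, 0.5]] * 3
print(external_regret(distributions, profits))
```

Here the first action earns 2 in total while the uniform mixture earns 1.5, so the external regret after three steps is 0.5.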

Regret minimization has strong connections to game-theoretic solution concepts. If each player 
in a repeated game executes a no-regret algorithm when selecting strategies, then convergence to 
an equilibrium is guaranteed in a variety of interesting contexts. The notion of convergence, and 
the kind of equilibrium reached, vary, and are dependent on the restrictions imposed on the game 
and on the type of regret being minimized (e.g., in zero-sum games, no-external-regret algorithms
are guaranteed to approach or exceed the minimax value of the game; in general games, if all
players minimize swap regret, then the empirical distribution of joint players' actions converges to
a correlated equilibrium, etc.). (See [3] and references therein.) Importantly, these results are all
proven within a model of interaction in which each player selects a strategy in each and every time 
step. 

We make the following simple observation. Consider a model in which the adversary not only
chooses the profit functions but also has the power not to allow the algorithm to select a new
distribution over actions in some time steps. That is, the adversary also selects a schedule $\sigma$ such
that $\forall t \in \mathbb{N}_+$, $\sigma(t) \in \{0, 1\}$, where 0 and 1 indicate whether the algorithm is not activated, or
activated, respectively. We restrict the schedule to be r-fair, in the sense that the schedule chosen
must be such that the algorithm is activated at least once in every r consecutive time steps. If
the algorithm is activated at time t and not activated again until time $t + \beta$ then it holds that
$\forall s \in \{t + 1, \ldots, t + \beta - 1\}$, $D_s = D_t$ (the algorithm cannot change its probability distribution over
actions while not activated). We observe that if an algorithm has no regret in the traditional setting
described above (for all three notions of regret), then it has no regret in this setting as well. To see
this, simply observe that if we regard each batch of time steps in which the algorithm is not activated
as one "meta time step", then this new setting is equivalent to the traditional setting (with
$p_t : [m] \to [0, r]$ for all $t \in \mathbb{N}_+$).
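
The "meta time step" argument can be illustrated with a short Python sketch: it checks r-fairness of a finite schedule prefix, merges the profit functions of each batch into a single meta profit function, and confirms on one example that the gain of the frozen distributions equals the gain against the meta profits. The sketch assumes the algorithm is activated at the first time step; all names and numbers are hypothetical.

```python
def is_r_fair(schedule, r):
    """True if every window of r consecutive steps of the finite schedule has an activation."""
    return all(any(schedule[s:s + r]) for s in range(len(schedule) - r + 1))

def meta_profits(schedule, profits):
    """Sum the profit vectors of each batch (an activation plus the idle steps after it)."""
    batches = []
    for active, profit in zip(schedule, profits):
        if active or not batches:
            batches.append(list(profit))
        else:
            batches[-1] = [a + b for a, b in zip(batches[-1], profit)]
    return batches

def async_gain(schedule, profits, distributions):
    """Total gain when distributions[k] is chosen at the k-th activation and then frozen."""
    total, k = 0.0, -1
    for active, profit in zip(schedule, profits):
        if active or k < 0:
            k += 1
        total += sum(d * p for d, p in zip(distributions[k], profit))
    return total

schedule = [1, 0, 1, 1, 0, 1]
profits = [[1, 0], [0, 1], [1, 0], [0, 1], [1, 1], [0, 0]]
batches = meta_profits(schedule, profits)
distributions = [[0.5, 0.5]] * len(batches)
meta_gain = sum(sum(d * p for d, p in zip(D, b)) for D, b in zip(distributions, batches))
print(is_r_fair(schedule, 2), batches, async_gain(schedule, profits, distributions), meta_gain)
```

Since each batch contains at most r time steps under an r-fair schedule, every entry of a meta profit function lies in [0, r], matching the rescaled range in the observation.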

This observation, while simple, is not uninteresting, as it implies that all regret-based results for
repeated games continue to hold even if players' order of activation is asynchronous (see Section 3
for a formal exposition of asynchronous interaction), so long as the schedule of player activations
is r-fair for some $r \in \mathbb{N}_+$. We mention two implications of this observation.

Observation E.l. When all players in a zero-sum game use no-external-regret algorithms then 
approaching or exceeding the minimax value of the game is guaranteed. 

Observation E.2. When all players in a (general) game use no-swap-regret algorithms the empirical
distribution of joint players' actions converges to a correlated equilibrium of the game.

Problem E.3. Give examples of repeated games for which there exists a schedule of player activations
that is not r-fair for any $r \in \mathbb{N}_+$ and for which regret-minimizing dynamics do not converge to
an equilibrium (for different notions of regret/convergence/equilibria).

Problem E.4. When is convergence of no-regret dynamics to an equilibrium guaranteed (for different
notions of regret/convergence/equilibria) for all r-fair schedules for non-fixed r's, that is,
when r is a function of t?

F An Axiomatic Approach 

We now use (a slight variation of) the framework of Taubenfeld, which he used to study resilient
consensus protocols [31], to prove Thm. 4.1. We first (Sec. F.2) define runs as sequences of events;
unlike Taubenfeld, we allow infinite runs. A protocol is then a collection of runs (which must satisfy
some natural conditions like closure under taking prefixes). We then define colorings of runs (which
correspond to outcomes that can be reached by extending a run in various ways) and define the
IoD property.

The proof of Thm. 4.1 proceeds in two steps. First, we show that any protocol that satisfies
IoD has some (fair, as formalized below), non-terminating activation sequence. We then show that
protocols that satisfy the hypotheses of Thm. 4.1 also satisfy IoD.

F.l Proof Sketch 

Proof Sketch. The proof follows the axiomatic approach of Taubenfeld [31] in defining asynchronous 
protocols in which states are colored by sets of colors; the set of colors assigned to a state must 
be a superset of the set of colors assigned to any state that is reachable (in the protocol) from 
it. We then show that any such protocol that satisfies a certain pair of properties (which we call
Independence of Decisions or IoD) and that has a polychromatic state must have a non-terminating
fair run in which all states are polychromatic.

For protocols with 1-recall, self-independence, and stationarity, we consider (in order to reach
a contradiction) protocols that are guaranteed to converge. Each starting state is thus guaranteed
to reach only stable states; we then color each state according to the outcomes that are reachable
from that state. We show that, under this coloring, such protocols satisfy IoD and that, as in
consensus protocols, the existence of multiple stable states implies the existence of a polychromatic
state. The non-terminating, polychromatic, fair run that is guaranteed to exist is, in this context,
exactly the non-convergent protocol run claimed by the theorem statement. We then show that
this may be extended to non-stationary protocols with k-recall (for k > 1). □

F.2 Events, Runs, and Protocols

Events are the atomic actions that are used to build runs of a protocol. Each event is associated 
with one or more principals; these should be thought of as the principals who might be affected 
by the event (e.g., as sender or receiver of a message), with the other principals unable to see the
event. We start with the following definition. 

Definition F.1 (Events and runs). There is a set E whose elements are called events; we assume
a finite set of possible events (although there will be no restrictions on how often any event may
occur). There is a set $\mathcal{P}$ of principals; each event has an associated set $S \subseteq \mathcal{P}$, and if S is the set
associated to $e \in E$, we will write $e_S$.

There is a set $\mathcal{R}$ whose elements are called runs; each run is a (possibly infinite) sequence of
events. We say that event e is enabled at run x if the concatenation (x; e) (i.e., the sequence of
events that is x followed by the single event e) is also a run. (We will require that $\mathcal{R}$ be prefix-closed
in the protocols we consider below.)

The definition of a protocol will also make use of a couple of types of relationship between runs;
our intuition for these relationships continues to view $e_P$ as meaning that event e affects the set
P of principals. From this intuitive perspective, two runs are equivalent with respect to a set S of
principals exactly when their respective subsequences that affect the principals in S are identical.
We also say that one run includes another whenever, from the perspective of every principal (i.e.,
restricting to the events that affect that principal), the included run is a prefix of the including
run. Note that this does not mean that the sequence of events in the included run is a prefix of
the sequence of events in the including run: events that affect disjoint sets of principals can be
reordered without affecting the inclusion relationship.

Definition F.2 (Run equivalence and inclusion). For a run x and $S \subseteq \mathcal{P}$, we let $x_S$ denote the
subsequence (preserving order and multiplicity) of events $e_P$ in x for which $P \cap S \neq \emptyset$. We say that
x and y are equivalent with respect to S, and we write $x[S]y$, if $x_S = y_S$. We say that y includes
x if for every node i, the restriction of x to those events $e_P$ with $i \in P$ is a prefix of the restriction
of y to such events.
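
For finite runs, these two relations can be sketched in Python as below. An event is modeled as a pair (name, frozenset of principals) and a run as a list of events; this representation and the function names are our own assumptions, and infinite runs are outside the scope of the sketch.

```python
def restrict(run, principals):
    """x_S: the events of the run whose principal set intersects S, in order."""
    return [event for event in run if event[1] & principals]

def equivalent(x, y, principals):
    """x[S]y: the two runs look identical to the principals in S."""
    return restrict(x, principals) == restrict(y, principals)

def includes(y, x, all_principals):
    """y includes x: for every principal i, x restricted to i is a prefix of y restricted to i."""
    for i in all_principals:
        xi, yi = restrict(x, {i}), restrict(y, {i})
        if yi[:len(xi)] != xi:
            return False
    return True

a = ("a", frozenset({1}))
b = ("b", frozenset({2}))
c = ("c", frozenset({1, 2}))
x = [a, b]
y = [b, a, c]
print(includes(y, x, {1, 2}), y[:len(x)] == x, equivalent(x, [b, a], {1}))
```

In this example y includes x even though x is not a prefix of y, because the events a and b affect disjoint sets of principals and can therefore be reordered.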

Our definitions of $x_S$ and $x[S]y$ generalize definitions given by Taubenfeld [31] for $|S| = 1$,
allowing us to consider events that are seen by multiple principals; other than this and the
allowance of infinite runs, the definitions we use in this section are the ones he used. Importantly,
however, we do not use the resilience property that Taubenfeld used.

Finally, we have the formal definition of an asynchronous protocol. This is a collection of runs 
that is closed under taking prefixes and only allows for finitely many (possibly 0) choices of a next 
event to extend the run. It also satisfies the property (P2 below) that, if a run can be extended by
an event that affects exactly the set S of principals, then any run that includes this run and that is 
equivalent to the first run with respect to S (so that only principals not in S see events that they 
don't see in the first run) can also be extended by the same event. 

Definition F.3 (Asynchronous protocol). An asynchronous protocol (or just a protocol) is a
collection of runs that satisfies the following three conditions.

P1 Every prefix of a run is a run.

P2 Let $(x; e_S)$ and y be runs. If y includes x, and if $x[S]y$, then $(y; e_S)$ is also a run.

P3 Only finitely many events are enabled at a run.

F.3 Fairness, Coloring, and Decisions

We start by recalling the definition of a fair sequence [31]; as usual, we are concerned with the 
behavior of fair runs. We also introduce the notion of a fair extension, which we will use to 
construct fair infinite runs. 

Definition F.4 (Fair sequence, fair extension). We define a fair sequence to be a sequence of 
events such that: every finite prefix of the sequence is a run; and, if the sequence is finite, then no 
event is enabled at the sequence, while if the sequence is infinite, then every event that is enabled 
at all but finitely many prefixes of the sequence appears infinitely often in the sequence. We define 
a fair extension of a (not necessarily fair) sequence x to be a finite sequence $e_1, e_2, \ldots$ of events
such that $e_1$ is enabled at x, $e_2$ is enabled at $(x; e_1)$, etc.

We also assign a set of "colors" to each sequence of events subject to the conditions below. 
As usual, the colors assigned to a sequence will correspond to the possible protocol outcomes that 
might be reached by extending the sequence. 

Definition F.5 (Asynchronous, C-chromatic protocol). Given a set C (called the set of colors),
we will assign sets of colors to sequences; this assignment may be a partial function. For a set C,
we will say that a protocol is C-chromatic if it satisfies the following properties.

C1 For each $c \in C$, there is a protocol run of color {c}.

C2 For each protocol run x of color $C' \subseteq C$, and for each $c \in C'$, there is an extension of x that
has color {c}.

C3 If y includes x and x has color $C'$, then the color of y is a subset of $C'$.

We say that a fair sequence is polychromatic if the set of colors assigned to it has more than one 
element. 

Finally, a C-chromatic protocol is called a decision protocol if it also satisfies the following 
property: 

D Every fair sequence has a finite monochromatic prefix, i.e., a prefix whose color is {c} for some
$c \in C$.

F.4 Independence of Decisions (IoD)

We turn now to the key (two-part) condition that we use to prove our impossibility results. 

Definition F.6 (Independence of Decisions (IoD)). A protocol satisfies Independence of Decisions
(IoD) if, whenever

• a run x is polychromatic and

• some event e is enabled at x and (x; e) is monochromatic of color {c},

then

1. for every $e' \neq e$ that is enabled at x, the color of (x; e') contains c, and

2. for every $e' \neq e$ that is enabled at x, if ((x; e'); e) is monochromatic, then its color is also {c}.
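
The two conditions can be checked mechanically at a single finite run, as in the Python sketch below. Runs are tuples of events, and the caller supplies two functions: enabled(run), returning the events enabled at a run, and color(run), returning its set of colors. These interfaces, the function names, and the toy color tables are hypothetical and serve only to illustrate the definition.

```python
def iod_holds_at(x, enabled, color):
    """Check conditions 1 and 2 of IoD at the single run x (a tuple of events)."""
    if len(color(x)) <= 1:
        return True  # IoD only constrains polychromatic runs
    events = list(enabled(x))
    for e in events:
        decided = color(x + (e,))
        if len(decided) != 1:
            continue  # e does not decide the outcome at x
        for other in events:
            if other == e:
                continue
            if not decided <= color(x + (other,)):
                return False  # condition 1: (x; e') must still allow c
            if e in enabled(x + (other,)):
                after = color(x + (other, e))
                if len(after) == 1 and after != decided:
                    return False  # condition 2: ((x; e'); e) decides a different color
    return True

def table_color(table):
    """Build a color function from a dictionary mapping runs to sets of colors."""
    return lambda run: frozenset(table[run])

always_ab = lambda run: ("a", "b")  # toy protocol: both events are always enabled
good = {(): {0, 1}, ("a",): {0}, ("b",): {0, 1}, ("a", "b"): {0}, ("b", "a"): {0}}
bad = dict(good)
bad[("b", "a")] = {1}  # now e = a decides 0 at x but decides 1 after b
print(iod_holds_at((), always_ab, table_color(good)))
print(iod_holds_at((), always_ab, table_color(bad)))
```

The second table violates condition 2: the event a yields color {0} when taken immediately, but color {1} when taken after b.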

Figure 1 illustrates the two conditions that form IoD. Both parts of the figure include the
polychromatic run x that can be extended to (x; e) with monochromatic color {c}; the color of
x necessarily includes c. The left part of the figure illustrates condition 1 and the right part of
the figure illustrates condition 2. The dashed arrow indicates a sequence of possibly many events,
while the solid arrows indicate single events. The labels on a node in the figure indicate what is
assumed/required about the set that colors the node.

Figure 1: Illustration of the two conditions of IoD.

Condition 1 essentially says that, if an event e decides the outcome of the protocol, then no
other event can rule out the outcome that e produced. The name "Independence of Decisions"
derives from condition 2, which essentially says that, if event e decides the outcome of the protocol
both before and after event e', then the decision that is made is independent of whether e' happens
immediately before or after e.

In working with IoD-satisfying protocols, the following lemma will be useful.

Lemma F.7. If IoD holds, then for any two events e and e' that are enabled at a run x, if both
(x; e) and (x; e') are monochromatic, then those colors are the same.

Proof. By IoD, the color of (x; e') must contain the color of (x; e), and both of these sets are
singletons. □

F.5 IoD-Satisfying Protocols Don't Always Converge

To show that IoD-satisfying protocols don't always converge, we proceed in two steps: first, we
show (Lemma F.8) that a polychromatic sequence can be fairly extended (in the sense of ... ) to
another polychromatic sequence; second, we use that lemma to show (Thm. F.9) ....

Lemma F.8 (The Fair-Extension Lemma). In a polychromatic decision protocol that satisfies IoD,
if a run x is polychromatic, then x can be extended by a fair extension to another polychromatic
run.

Proof. Assume that, for some $C'$, there is a run x of color $C'$ that cannot be fairly extended to
another polychromatic run. Because $|C'| > 1$, there must be some event that is enabled at x; if
not, we would contradict D. Figure 2 illustrates this (and the arguments in the rest of the proof
below).

Consider the extensions of x that use as many distinct events as possible and that are polychromatic,
and pick one of these, y, that minimizes the number of events that are enabled at every
prefix of y (after x has already been executed) but that do not appear in y. If y contains no events
(illustrated in the top left of Fig. 2), then every event e that is enabled at x is such that (x; e)
is monochromatic. By Lemma F.7 these singletons must all be the same color {c}; however, this
means that for $c' \in C' \setminus \{c\} \neq \emptyset$, x does not have any extensions whose color is {c'}, contradicting
C2.

Figure 2: Illustration of proof of Lem. F.8.

If y contains one or more events (illustrated in the top right and bottom of Fig. 2), then
(because it is not a fair extension of x) there is at least one event e that is enabled everywhere
in the extension, including at (x; y), but that does not appear anywhere in y. Because y was
chosen instead of (y; e) (or another extension with the same number of distinct events), the color
of ((x; y); e) must be a singleton {c}. Because (x; y) is polychromatic, it has some extension z
that is (eventually) monochromatic with color $\{d\} \neq \{c\}$; let e' be the first event in this extension.
Because IoD is satisfied, the color of ((x; y); e') also contains c and is thus polychromatic. The event
e is again enabled here (else ((x; y); e') would have been chosen instead of y). If (((x; y); e'); e)
is not monochromatic (top right of Fig. 2), then it is a polychromatic extension of x that uses
more distinct events than does y, a contradiction. If (((x; y); e'); e) is monochromatic (bottom of
Fig. 2), then by IoD it has color {c}. We may then inductively move along the extension z; after
each additional event from z is appended to the run, the resulting run is polychromatic (its color
set must include d, but if it is monochromatic it must have color {c}) and again enables e (by our
choice of y). Again by our choice of y, appending e to this run must produce a monochromatic
run, which (by IoD) must have color {c}. Proceeding along z, we must then eventually reach a
polychromatic run at which e is enabled (and produces a monochromatic run of color {c}) and
which also enables a different event that yields a monochromatic run of color {d}. This contradicts
Lem. F.7. □

Theorem F.9. Any IoD-satisfying asynchronous protocol with a polychromatic initial state has a
fair sequence that starts at this initial state and never reaches a decision, i.e., it has a fair sequence
that does not have a monochromatic prefix.



Proof. Start with the empty (polychromatic) run and iteratively apply the fair-extension lemma to 
obtain an infinite polychromatic sequence. If an event e is enabled at all but finitely many prefixes 
in this sequence, then in all but finitely many of the fair extensions, e is enabled at every step of 
the extension. Because these extensions are fair (in the sense of Def. F.4), e is activated in each
of these (infinitely many) extensions and so appears infinitely often in the sequence, which is thus 
fair. □ 



F.6 1-Recall, Stationary, Self-Independent Protocols Need Not Converge 

We first recall the statement of Thm. 4.1. We then show that 1-recall, historyless protocols satisfy
IoD when colored as in Def. F.10. Theorem F.9 then implies that such protocols do not always
converge; it immediately follows that this also applies to bounded-recall (and not just 1-recall)
protocols.

Theorem 4.1. If each node i has bounded recall, and each reaction function $f_i$ is self-independent
and stationary, then the existence of two stable states implies that the computational network is not
safe.

Definition F.10 (Stable coloring). In a protocol defined as in Sec. 3, the stable coloring of protocol
states is the coloring that has a distinct color for each stable state and that colors each state in a 
run with the set of colors corresponding to the stable states that are reachable from that state. 

We model the dynamics of a 1-recall, historyless protocol as follows. There are two types of
actions: the application of nodes' reaction functions, where $e_i$ is the action of node i acting as
dictated by $f_i$, and a "reveal" action W. The nodes scheduled to react in the first timestep do so
sequentially, but these actions are not yet visible to the other nodes (so that nodes after the first
one in the sequence are still reacting to the initial state and not to the actions performed earlier in
the sequence). Once all the scheduled nodes have reacted, the W action is performed; this reveals
the newly performed actions to all the other nodes in the network. The nodes that are scheduled
to react at the next timestep then act in sequence, followed by another W action, and so on. This
converts the simultaneous-action model of Sec. 3 to one in which actions are performed sequentially;
we will use this "act-and-tell" model in the rest of the proof. We note that all actions are enabled
at every step (so that, e.g., $e_i$ can be taken multiple times between W actions; however, this is
indistinguishable from a single $e_i$ action because the extra occurrences are not seen by other nodes,
and they do not affect i's actions, which are governed by a historyless reaction function).

Once we cast the dynamics of 1-recall, historyless protocols in the act-and-tell model, the 
following lemma will be useful. 

Lemma F.11 (Color equalities). In a 1-recall, historyless protocol (in the act-and-tell model):

1. For every pair of runs x, y and every $i \in [n]$, the color of $((x; e_i W e_i W); y)$ is the same
as the color of $((x; W e_i W); y)$.

2. For every pair of runs x, y and every $i, j \in [n]$, the color of $((x; e_i e_j); y)$ is the same as
the color of $((x; e_j e_i); y)$.

Informally, the first color equality says that, if all updates are announced and then i activates and
then all updates are revealed again (i's new output being the only new one), it makes no difference
whether or not i was activated immediately before the first reveal action. The second color equality
says that, as long as there is no intervening reveal event, the order in which nodes compute their
outputs does not matter (because they do not have access to their neighbors' new outputs until the
reveal event).
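
The act-and-tell model and the intuition behind the two color equalities can be illustrated with a small Python simulation. A state is a pair consisting of the visible action profile and the not-yet-revealed updates; node events are integers and the reveal event is the string "W". The sketch compares resulting states (which determine colors) rather than colors themselves, and the three-node reaction functions are a hypothetical example, not taken from the text.

```python
def step(state, event, reactions):
    """Apply one act-and-tell event to state = (visible profile, pending updates)."""
    visible, pending = state
    if event == "W":
        # Reveal: every pending action becomes visible to all nodes.
        revealed = tuple(pending.get(i, a) for i, a in enumerate(visible))
        return revealed, {}
    i = event
    others = visible[:i] + visible[i + 1:]  # self-independence: i ignores its own action
    updated = dict(pending)
    updated[i] = reactions[i](others)
    return visible, updated

def run(state, events, reactions):
    for event in events:
        state = step(state, event, reactions)
    return state

# Hypothetical example: each node plays x iff all other nodes currently show x.
reactions = [lambda others: "x" if set(others) == {"x"} else "y"] * 3
start = (("x", "x", "y"), {})

# Second equality: without an intervening reveal, the order of e_0 and e_1 is irrelevant.
print(run(start, [0, 1], reactions) == run(start, [1, 0], reactions))
# First equality: an extra e_0 just before the first reveal makes no difference.
print(run(start, [2, 0, "W", 0, "W"], reactions) == run(start, [2, "W", 0, "W"], reactions))
```

Both comparisons succeed for this example because node events read only the visible profile, and a node's own earlier output does not influence its next reaction.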

Proof. For the first color equality, because the protocol is self-independent, the first occurrence of
$e_i$ (after x) in $((x; e_i W e_i W); y)$ does not affect the second occurrence of $e_i$. Because the protocol
has 1-recall, the later events (in y) are also unaffected.

The second color equality is immediate from the definition of the act-and-tell model. □ 
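
As a sanity check, both equalities can be tested exhaustively on a small example. The sketch below assumes a hypothetical three-node protocol with binary actions and an arbitrary self-independent reaction function `react`; it compares the states reached by the two event sequences, and equal states necessarily have equal colors. It illustrates the lemma on one instance and is not a proof.

```python
from itertools import product

N = 3
EVENTS = list(range(N)) + ["W"]  # an int i stands for e_i, "W" for the reveal


def react(i, others):
    # Hypothetical reaction function: depends only on the other nodes'
    # revealed bits (self-independence) and on nothing older (1-recall).
    return (sum(others) + i) % 2


def step(state, ev):
    revealed, pending = state
    if ev == "W":
        return pending, pending
    others = revealed[:ev] + revealed[ev + 1:]
    return revealed, pending[:ev] + (react(ev, others),) + pending[ev + 1:]


def run(state, events):
    for ev in events:
        state = step(state, ev)
    return state


def equalities_hold(max_prefix_len=3):
    """Check both equalities after every prefix x of bounded length."""
    for actions in product((0, 1), repeat=N):
        start = (actions, actions)
        for length in range(max_prefix_len + 1):
            for x in product(EVENTS, repeat=length):
                s = run(start, x)
                for i in range(N):
                    # First equality: x; e_i W e_i W  versus  x; W e_i W
                    if run(s, (i, "W", i, "W")) != run(s, ("W", i, "W")):
                        return False
                    for j in range(N):
                        # Second equality: x; e_i e_j  versus  x; e_j e_i
                        if run(s, (i, j)) != run(s, (j, i)):
                            return False
    return True
```

Because the compared states coincide, any continuation y leads to the same runs afterwards, which is the sense in which the colors agree.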

Lemma F.12. If a protocol is 1-recall and historyless, then the protocol (with the stable coloring) 
satisfies IoD. 

Proof. Color each state in the protocol's runs according to the stable states that can be reached 
from it. Assume x is a polychromatic run (with color C) and that some event e is such that 
(x; e) is monochromatic (with color {c}). Let e' be another event (recall that all events are always 
enabled). If e and e' are two distinct node events e_i and e_j (i ≠ j), respectively, then the color of 
((x; e_j); e_i) is the color of ((x; e_i); e_j) and thus the (monochromatic) color of (x; e_i), i.e., {c}. If e 
and e' are both W or are the same node event e_i, then the claim is trivial. 

If e = e_i and e' = W (as illustrated in the left of Fig. 3), then we may extend (x; e_i) by W e_i W 
to obtain a run whose color is again {c}. By the first color equality, this is also the color of the 
extension of (x; W) by e_i W, so the color of (x; W) contains c and if the extension of (x; W) by 
e_i is monochromatic, its color must be {c} as well. If, on the other hand, e = W and e' = e_i (as 
illustrated in the right of Fig. 3), we may extend (x; W) by e_i W and (x; e_i) by W e_i W to obtain 
runs of color {c}; so the color of (x; e_i) must contain c and, arguing as before, if the intermediate 
extension ((x; e_i); W) is monochromatic, its color must also be {c}. □ 



Figure 3: Illustrations of the arguments in the proof of Lem. F.12. 
Lemma F.13. If a 1-recall, historyless computation that always converges can, for different starting 
states, converge to different stable states then there is some input from which the computation can 
reach multiple stable states. In particular, under the stable coloring, there is a polychromatic state. 

Proof. Assume there are (under the stable coloring) two different monochromatic input states for 
the computation, that the inputs differ only at one node v, and that the computation always 
converges (i.e., for every fair schedule) on both input states. Consider a fair schedule that activates 
v first and then proceeds arbitrarily. Because the inputs to v's reaction function are the same in 
each case, after the first step in each computation, the resulting two networks have the same node 
states. This means that the computations will subsequently unfold in the same way, in particular 
producing identical outputs. 

If a historyless computation that always converges can produce two different outputs, then 
iterated application of the above argument leads to a contradiction unless there is a polychromatic 
initial state. □ 
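
To make the notion of a polychromatic state concrete, the following sketch computes a stable coloring by brute force. It is a simplified illustration under stated assumptions: it uses the simultaneous-action model (any nonempty subset of nodes may be activated at a step) rather than the act-and-tell model, and the two-node "copy the other node" protocol and all function names are hypothetical.

```python
from itertools import combinations, product


def successors(state, reactions):
    """States reachable in one step by activating a nonempty subset of nodes."""
    n = len(state)
    for size in range(1, n + 1):
        for subset in combinations(range(n), size):
            nxt = list(state)
            for i in subset:
                nxt[i] = reactions[i](state)  # all activated nodes read the old state
            yield tuple(nxt)


def stable_coloring(actions, reactions):
    """Map each state to the set of stable states reachable from it."""
    n = len(reactions)
    states = list(product(actions, repeat=n))
    stable = {s for s in states
              if all(reactions[i](s) == s[i] for i in range(n))}
    colors = {}
    for s in states:
        seen, frontier = {s}, [s]
        while frontier:
            cur = frontier.pop()
            for nxt in successors(cur, reactions):
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append(nxt)
        colors[s] = seen & stable
    return colors


# Hypothetical protocol: each node copies the other (self-independent).
copy_other = [lambda s: s[1], lambda s: s[0]]
colors = stable_coloring((0, 1), copy_other)
polychromatic = sorted(s for s, c in colors.items() if len(c) > 1)
```

Here the two stable states are (0, 0) and (1, 1), and the states (0, 1) and (1, 0) are polychromatic. Activating both nodes at every step from (0, 1) alternates between those two states forever, which is a fair run that never converges.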

Proof of 1-recall, stationary part of Thm. 4.1. Consider a protocol with 1-recall, self-independence, 
and stationarity, and that has two different stable states. If there is some non-convergent run of 
the protocol, then the network is not safe (as claimed). Now assume that all runs converge; we 
will show that this leads to a contradiction. Color all states in the protocol's runs according to the 
stable coloring (Def. F.10). Lemma F.13 implies that there is a polychromatic state. Because, by 
Lem. F.12, IoD is satisfied, we may apply Thm. F.9. In this context (with the stable coloring), 
this implies that there is an infinite run in which every state can reach at least two stable states; 
in particular, the run does not converge. □ 



F.7 Extension to Non-stationary Protocols 

We may extend our results to non-stationary protocols as well. 

Theorem F.14. If each node i has 1-recall, the action spaces are all finite, and each reaction 
function f_i is self-independent but not necessarily stationary, then the existence of two stable states 
implies that the computational network is not safe. 

Proof. In this context, a stable state is a vector of actions and a time t such that, after t, the 
action vector is a fixed point of the reaction functions. Let T be the largest such t over all the 
(finitely many) stable states (and ensure that T is at least k for generalizing to k-recall). Assume 
that the protocol is in fact safe; this means that, under the stable coloring, every state gets at least 
one color. If there are only monochromatic states, consider the states at time T; we view two of 
these states as adjacent if they differ only in the action (or action history for the generalization 
to k-recall) of one node. Because the protocol is self-independent, that node may be activated (k 
times if necessary) to produce the same state. In particular, this means that adjacent states must 
have the same monochromatic color. Because (among the states at time T) there is a path (following 
state adjacencies) from any one state to any other, only one stable state is possible, contradicting 
the hypotheses of the theorem. 

Considering the proof of Lem. F.12, we see that the number of timesteps required to traverse 
each of the subfigures in Fig. 3 does not depend on which path (left or right) through the subfigure 
we take. In particular, this means that the reaction functions are not affected by the choice of 
path. Furthermore, the non-W actions in each subfigure only involve a single node i; the final 
action performed by i along each path occurs after one W action has been performed (after x), so 
these final actions are the same (because the timesteps at which they occur are the same, as are 
the actions of all the other nodes in the network). □ 
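
The adjacency argument in the first paragraph of this proof relies on a path between any two action vectors that changes one node at a time. A minimal sketch of that path construction follows; the function names and the toy coloring in the usage lines are hypothetical and serve only to show where a color change would have to occur.

```python
def adjacency_path(start, goal):
    """Path from start to goal in which consecutive states differ at one node."""
    if len(start) != len(goal):
        raise ValueError("states must have the same number of nodes")
    cur = list(start)
    path = [tuple(cur)]
    for i, target in enumerate(goal):
        if cur[i] != target:
            cur[i] = target          # change only node i's action
            path.append(tuple(cur))
    return path


def first_color_change(path, color_of):
    """Return the first adjacent pair on the path with different colors, if any."""
    for a, b in zip(path, path[1:]):
        if color_of(a) != color_of(b):
            return a, b
    return None


path = adjacency_path((0, 0, 1), (1, 0, 0))
# Toy coloring (an assumption for illustration): color a state by node 0's action.
change = first_color_change(path, lambda s: s[0])
```

If the two endpoints had different monochromatic colors, `first_color_change` would locate an adjacent pair with different colors, which is exactly what the self-independence argument rules out.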



F.8 Extension to Bounded-Recall Protocols 

If we allow k-recall for k > 1, we must make a few straightforward adjustments to the proofs above. 
Generalizing the argument used in the proof of the color equalities (Lem. F.11), we may prove an 
analogue of these for k-recall; in particular, we replace the first color equality by an equality between 
the colors of ((x; e_i W (e_i W)^k); y) and ((x; W (e_i W)^k); y). This leads to the analogue of Lem. F.12 
for bounded-recall protocols; as in Lem. F.12, the two possible paths through each subfigure (in 
the k-recall analogue of Fig. 3) require the same number of timesteps, so non-stationarity is not a 
problem. 

Considering adjacent states as those that differ only in the actions of one node (at some point 
in its depth-k history), we may construct a path from any monochromatic initial state to any other 
such state. Because the one node that differs between two adjacent states may be (fairly) activated 
k times to start the computation, two monochromatic adjacent states must have the same color; as 
in the 1-recall case, the existence of two stable states thus implies the existence of a polychromatic 
state. 

G Implications for Resilient Decision Protocols 

The consensus problem is fundamental to distributed computing research. We give a brief 
description of it here, and we refer the reader to [31] for a detailed explanation of the model. We then 
show how to apply our general result to this setting. This allows us to show that the impossibility 
result in [12], which shows that there is no protocol that solves the consensus problem, can be 
obtained as a corollary of Thm. F.9. 

G.1 The Consensus Problem 

Processes and consensus. There are N > 2 processes 1, ..., N, each process i with an initial 
value x_i ∈ {0, 1}. The processes communicate with each other via messages. The objective is for 
all non-faulty processes to eventually agree on some value x ∈ {0, 1}, such that x = x_i for some 
i ∈ [N] (that is, the value that has been decided must match the initial value of some process). No 
computational limitations whatsoever are imposed on the processes. The difficulty in reaching an 
agreement (consensus) lies elsewhere: the network is asynchronous, and so there is no upper bound 
on the length of time processes may take to receive, process and respond to an incoming message. 
Intuitively, it is therefore impossible to tell whether a process has failed, or is simply taking a long 
time. 

Messages and the message buffer. Messages are pairs of the form (p, m), where p is the 
process the message is intended for, and m is the contents of the message. Messages are stored in 
an abstract data structure called the message buffer. The message buffer is a multiset of messages, 
i.e., more than one of any pair (p, m) is allowed, and supports two operations: (1) send(p, m): places 
a message in the message buffer. (2) receive(p): returns a message for processor p (and removes it 
from the message buffer) or the special value null, which has no effect. If there are several messages for 
p in the message buffer then receive(p) returns one of them at random. 
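
The message buffer can be sketched as a small Python class. This is an illustration under stated assumptions: the class name, the use of `None` for the null value, and the `null_probability` parameter (which lets receive return null even when a message for p is waiting, one way of modelling arbitrary delay) are not from the paper.

```python
import random
from collections import Counter


class MessageBuffer:
    """Multiset of (p, m) pairs with send and receive operations."""

    def __init__(self, seed=None, null_probability=0.5):
        self._messages = Counter()            # (p, m) -> multiplicity
        self._rng = random.Random(seed)
        self._null_probability = null_probability  # assumed modelling knob

    def send(self, p, m):
        """Place the message (p, m) in the buffer; duplicates are allowed."""
        self._messages[(p, m)] += 1

    def receive(self, p):
        """Return and remove one message for p, or None (null)."""
        waiting = [m for (q, m), count in self._messages.items()
                   if q == p for _ in range(count)]
        if not waiting or self._rng.random() < self._null_probability:
            return None
        m = self._rng.choice(waiting)         # pick one waiting message at random
        self._messages[(p, m)] -= 1
        if self._messages[(p, m)] == 0:
            del self._messages[(p, m)]
        return m

    def __len__(self):
        return sum(self._messages.values())
```

With `null_probability=0.0` the buffer always delivers a waiting message, which makes the behavior easy to check; any larger value lets deliveries be postponed.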

Configurations and system evolution. A configuration is defined by the following two factors: 
(1) the internal state of all of the processors (the current step in the protocol that they are executing, 
the contents of their memory), and (2) the contents of the message buffer. The system moves from 
one configuration to the next by a step which consists of a process p performing receive(p) and 
moving to another internal state. Therefore, the only way that the system state may evolve is 
by some processor receiving a message (or null) from the message buffer. Each step is therefore 
uniquely defined by the message that is received (possibly) and the process that received it. 
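
A step of the system can be sketched as a pure function on configurations. The representation below is an assumption made for illustration: a configuration is a pair (internal states, buffer), `transition` is a hypothetical per-process rule, and the list of outgoing messages it returns follows the model of [12] rather than anything stated in this section.

```python
from collections import Counter


def step(states, buffer, p, m, transition):
    """Apply the step in which process p receives m (None stands for null).

    Returns a new configuration; the inputs are left unchanged.
    """
    new_buffer = Counter(buffer)
    if m is not None:
        if new_buffer[(p, m)] == 0:
            raise ValueError("message (p, m) is not in the buffer")
        new_buffer[(p, m)] -= 1
    new_state, outgoing = transition(p, states[p], m)
    new_states = dict(states)
    new_states[p] = new_state
    for q, body in outgoing:
        new_buffer[(q, body)] += 1
    return new_states, +new_buffer  # unary + drops entries with zero count


def record(p, state, m):
    # Hypothetical transition: remember what was received, send nothing.
    return state + (m,), []


states0 = {1: (), 2: ()}
buffer0 = Counter({(1, "a"): 1, (2, "b"): 1})
one_then_two = step(*step(states0, buffer0, 1, "a", record), 2, "b", record)
two_then_one = step(*step(states0, buffer0, 2, "b", record), 1, "a", record)
```

In this example the two orders give the same configuration, since each step touches only the receiving process's state and its own message.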

Executions and failures. From any initial starting state of the system, defined by the initial 
values of the processes, there are many different possible ways for the system to evolve (as the 
receive(p) operation is non-deterministic). We say that a protocol solves consensus if the objective 
is achieved for every possible execution. Processes are allowed to fail according to the fail-stop 
model, that is, processes that fail do so by ceasing to work correctly. Hence, in each execution, 
non-faulty processes participate in infinitely many steps (presumably eventually just receiving null once 
the algorithm has finished its work), while processes that stop participating in an execution at 
some point are considered faulty. We are concerned with the handling of (at most) a single faulty 
process. Hence, an execution is admissible if at most one process is faulty. 

G.2 Impossibility of Resilient Consensus 

We now show how this fits into the formal framework of Ap. F. The events are (as in [12]) messages 
annotated with the intended recipient (e.g., m_i). In addition to the axioms of Ap. F, we also assume 
that the protocol satisfies the following resiliency property, which we adapt from Taubenfeld [31]; 
we call such a protocol a resilient consensus protocol. (Intuitively, this property ensures that if 
node i fails, the other nodes will still reach a decision.) 

Res For each run x and node i, there is a monochromatic run y that extends x such that x [i] y. 

We show that resilient consensus protocols satisfy IoD. Unsurprisingly, the proof draws on ideas 
of Fischer, Lynch, and Paterson. 

Lemma G.1. Resilient consensus protocols satisfy IoD. 

Proof. Assume x is a polychromatic run of a resilient consensus protocol and that (x; m_i) is 
monochromatic (of color {c}). If e' = m'_j for j ≠ i, then e = m_i and e' commute (because 
the messages are processed by different nodes) and the IoD conditions are satisfied. (In particular, 
((x; e); e') and ((x; e'); e) both have the same monochromatic color.) 

If e' = m'_i, then consider a sequence σ from x that reaches a monochromatic run and that does 
not involve i (the existence of σ is guaranteed by Res); this is illustrated in Fig. 4. Because σ 
doesn't involve i, it must commute with e and e'; in particular, the color of the monochromatic 
run reachable by applying σ to (x; e) is the same as the color of the run ((x; σ); e). Thus σ must 
produce the same color {c} that e does in extending x. On the other hand, we may apply this same 
argument to e' to see that ((x; e'); σ) must also have the same color as (x; σ), so the color of (x; e') 
contains the color of (x; e). The remaining question is whether ((x; e'); e) can be monochromatic 
of a different color than (x; e). However, the color (if it is monochromatic) of (((x; e'); e); σ) must 
be the same (because σ does not involve i) as the color of (((x; e'); σ); e), which we have already 
established is the color of (x; e); thus, ((x; e'); e) cannot be monochromatic of a different color. □ 
Figure 4: Illustration of argument in the proof of Lem. G.1. 



Using Thm. F.9 and the fact that there must be a polychromatic initial configuration for the 
protocol (because it can reach multiple outcomes, as shown in [12]), we obtain from this lemma the 
following celebrated result of Fischer, Lynch, and Paterson [12]. 

Theorem G.2 (Fischer-Lynch-Paterson [12]). There is no always-terminating protocol that solves 
the consensus problem. 